(54) Title: PROTEIN-PROTEIN INTERACTIONS

(57) Abstract: The present invention relates to the discovery of novel protein-protein interactions that are involved in mammalian
physiological pathways, including physiological disorders or diseases. Examples of physiological disorders and diseases include
non-insulin dependent diabetes mellitus (NIDDM), neurodegenerative disorders, such as Alzheimer's Disease (AD), and the like.
Thus, the present invention is directed to complexes of these proteins and/or their fragments, antibodies to the complexes, diagnosis 
of physiological generative disorders (including diagnosis of a predisposition to and diagnosis of the existence of the disorder), drug 
screening for agents which modulate the interaction of proteins described herein, and identification of additional proteins in the 
pathway common to the proteins described herein. 
PROTEIN-PROTEIN INTERACTIONS

BACKGROUND OF THE INVENTION 
[0001] The present invention relates to the discovery of novel protein-protein interactions
that are involved in mammalian physiological pathways, including physiological disorders or
diseases. Examples of physiological disorders and diseases include non-insulin dependent diabetes
mellitus (NIDDM), neurodegenerative disorders, such as Alzheimer's Disease (AD), and the like.
Thus, the present invention is directed to complexes of these proteins and/or their fragments,
antibodies to the complexes, diagnosis of physiological generative disorders (including diagnosis
of a predisposition to and diagnosis of the existence of the disorder), drug screening for agents
which modulate the interaction of proteins described herein, and identification of additional proteins
in the pathway common to the proteins described herein.

[0002] The publications and other materials used herein to illuminate the background of the
invention, and in particular, cases to provide additional details respecting the practice, are
incorporated herein by reference, and for convenience, are referenced by author and date in the
following text and respectively grouped in the appended Bibliography.

[0003] Many processes in biology, including transcription, translation and metabolic or
signal transduction pathways, are mediated by non-covalently associated protein complexes. The
formation of protein-protein complexes or protein-DNA complexes produces the most efficient
chemical machinery. Much of modern biological research is concerned with identifying proteins
involved in cellular processes, determining their functions, and how, when and where they interact
with other proteins involved in specific pathways. Further, with rapid advances in genome
sequencing, there is a need to define protein linkage maps, i.e., detailed inventories of protein
interactions that make up functional assemblies of proteins or protein complexes or that make up
physiological pathways.

[0004] Recent advances in human genomics research have led to rapid progress in the
identification of novel genes. In applications to biological and pharmaceutical research, there is a
need to determine functions of gene products. A first step in defining the function of a novel gene
is to determine its interactions with other gene products in appropriate context. That is, since
proteins make specific interactions with other proteins or other biopolymers as part of functional
assemblies or physiological pathways, an appropriate way to examine function of a gene is to
determine its physical relationship with other genes. Several systems exist for identifying protein
interactions and hence relationships between genes.

[0005] There continues to be a need in the art for the discovery of additional protein-protein
interactions involved in mammalian physiological pathways. There continues to be a need in the
art also to identify the protein-protein interactions that are involved in mammalian physiological
disorders and diseases, and to thus identify drug targets.

SUMMARY OF THE INVENTION 

[0006] The present invention relates to the discovery of protein-protein interactions that are
involved in mammalian physiological pathways, including physiological disorders or diseases, and
to the use of this discovery. The identification of the interacting proteins described herein provides
new targets for the identification of useful pharmaceuticals, new targets for diagnostic tools in the
identification of individuals at risk, sequences for production of transformed cell lines, cellular
models and animal models, and new bases for therapeutic intervention in such physiological
pathways.

[0007] Thus, one aspect of the present invention is protein complexes. The protein
complexes are a complex of (a) two interacting proteins, (b) a first interacting protein and a fragment
of a second interacting protein, (c) a fragment of a first interacting protein and a second interacting
protein, or (d) a fragment of a first interacting protein and a fragment of a second interacting protein.
The fragments of the interacting proteins include those parts of the proteins which interact to form
a complex. This aspect of the invention includes the detection of protein interactions and the
production of proteins by recombinant techniques. The latter embodiment also includes cloned
sequences, vectors, transfected or transformed host cells and transgenic animals.

[0008] A second aspect of the present invention is an antibody that is immunoreactive with
the above complex. The antibody may be a polyclonal antibody or a monoclonal antibody. While
the antibody is immunoreactive with the complex, it is not immunoreactive with the component
parts of the complex. That is, the antibody is not immunoreactive with a first interacting protein,
a fragment of a first interacting protein, a second interacting protein or a fragment of a second
interacting protein. Such antibodies can be used to detect the presence or absence of the protein
complexes.

[0009] A third aspect of the present invention is a method for diagnosing a predisposition
for physiological disorders or diseases in a human or other animal. The diagnosis of such disorders
includes a diagnosis of a predisposition to the disorders and a diagnosis for the existence of the
disorders. In accordance with this method, the ability of a first interacting protein or fragment
thereof to form a complex with a second interacting protein or a fragment thereof is assayed, or the
genes encoding interacting proteins are screened for mutations in interacting portions of the protein
molecules. The inability of a first interacting protein or fragment thereof to form a complex, or the
presence of mutations in a gene within the interacting domain, is indicative of a predisposition to,
or existence of, a disorder. In accordance with one embodiment of the invention, the ability to form
a complex is assayed in a two-hybrid assay. In a first aspect of this embodiment, the ability to form
a complex is assayed by a yeast two-hybrid assay. In a second aspect, the ability to form a complex
is assayed by a mammalian two-hybrid assay. In a second embodiment, the ability to form a
complex is assayed by measuring in vitro a complex formed by combining said first protein and said
second protein. In one aspect the proteins are isolated from a human or other animal. In a third
embodiment, the ability to form a complex is assayed by measuring the binding of an antibody
which is specific for the complex. In a fourth embodiment, the ability to form a complex is assayed
by measuring the binding of an antibody that is specific for the complex with a tissue extract from
a human or other animal. In a fifth embodiment, coding sequences of the interacting proteins
described herein are screened for mutations.

[0010] A fourth aspect of the present invention is a method for screening for drug candidates
which are capable of modulating the interaction of a first interacting protein and a second interacting
protein. In this method, the amount of the complex formed in the presence of a drug is compared
with the amount of the complex formed in the absence of the drug. If the amount of complex
formed in the presence of the drug is greater than or less than the amount of complex formed in the
absence of the drug, the drug is a candidate for modulating the interaction of the first and second
interacting proteins. The drug promotes the interaction if the amount of complex formed in the
presence of the drug is greater, and inhibits (or disrupts) the interaction if the amount of complex
formed in the presence of the drug is less. The drug may affect the interaction directly, i.e., by
modulating the binding of the two proteins, or indirectly, e.g., by modulating the expression of one
or both of the proteins.
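
The comparison described in this screening method can be illustrated with a minimal sketch. The function name, the enumeration, and the `tolerance` parameter (a margin below which two measurements are treated as equal) are assumptions introduced for illustration only; the text itself specifies only a greater-than or less-than comparison of the amounts of complex.

```python
from enum import Enum


class Effect(Enum):
    PROMOTES = "promotes"
    INHIBITS = "inhibits"
    NO_EFFECT = "no effect"


def classify_candidate(complex_with_drug: float,
                       complex_without_drug: float,
                       tolerance: float = 0.0) -> Effect:
    """Compare the amount of complex formed with and without the drug.

    `tolerance` is a hypothetical measurement margin; with the default of
    0.0 any difference counts, as in the plain comparison described above.
    """
    if complex_with_drug < 0 or complex_without_drug < 0 or tolerance < 0:
        raise ValueError("amounts and tolerance must be non-negative")
    difference = complex_with_drug - complex_without_drug
    if difference > tolerance:
        return Effect.PROMOTES   # more complex forms in the presence of the drug
    if difference < -tolerance:
        return Effect.INHIBITS   # less complex forms in the presence of the drug
    return Effect.NO_EFFECT      # not a candidate under this comparison
```

A drug classified as `PROMOTES` or `INHIBITS` would be a candidate for modulating the interaction; the sketch does not distinguish direct from indirect effects.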

[0011] A fifth aspect of the present invention is a model for such physiological pathways,
disorders or diseases. The model may be a cellular model or an animal model, as further described
herein. In accordance with one embodiment of the invention, an animal model is prepared by
creating transgenic or "knock-out" animals. The knock-out may be a total knock-out, i.e., the
desired gene is deleted, or a conditional knock-out, i.e., the gene is active until it is knocked out at
a determined time. In a second embodiment, a cell line is derived from such animals for use as a
model. In a third embodiment, an animal model is prepared in which the biological activity of a
protein complex of the present invention has been altered. In one aspect, the biological activity is
altered by disrupting the formation of the protein complex, such as by the binding of an antibody
or small molecule to one of the proteins which prevents the formation of the protein complex. In
a second aspect, the biological activity of a protein complex is altered by disrupting the action of
the complex, such as by the binding of an antibody or small molecule to the protein complex which
interferes with the action of the protein complex as described herein. In a fourth embodiment, a cell
model is prepared by altering the genome of the cells in a cell line. In one aspect, the genome of
the cells is modified to produce at least one protein complex described herein. In a second aspect,
the genome of the cells is modified to eliminate at least one protein of the protein complexes
described herein.

[0012] A sixth aspect of the present invention is directed to nucleic acids coding for novel proteins
discovered in accordance with the present invention and the corresponding proteins and antibodies.

[0013] A seventh aspect of the present invention is a method of screening for drug
candidates useful for treating a physiological disorder. In this embodiment, drugs are screened on
the basis of the association of a protein with a particular physiological disorder. This association
is established in accordance with the present invention by identifying a relationship of the protein
with a particular physiological disorder. The drugs are screened by comparing the activity of the
protein in the presence and absence of the drug. If a difference in activity is found, then the drug
is a drug candidate for the physiological disorder. The activity of the protein can be assayed in vitro
or in vivo using conventional techniques, including transgenic animals and cell lines of the present
invention.

DETAILED DESCRIPTION OF THE INVENTION

[0014] The present invention is the discovery of novel interactions between proteins
described herein. The genes coding for some of these proteins may have been cloned previously,
but their potential interaction in a physiological pathway or with a particular protein was unknown.
Alternatively, the genes coding for some of these proteins have not been cloned previously and
represent novel genes. These proteins are identified using the yeast two-hybrid method and
searching a human total brain library, as more fully described below.
[0015] According to the present invention, new protein-protein interactions have been
discovered. The discovery of these interactions has identified several protein complexes for each
protein-protein interaction. The protein complexes for these interactions are set forth below in
Tables 1-10, which also identify the new protein-protein interactions of the present invention.

TABLE 1 

Protein Complexes LXR-alpha/Utrophin Interaction
Oxysterol liver X receptor alpha (LXR-alpha) and utrophin 
A fragment of LXR-alpha and utrophin 
LXR-alpha and a fragment of utrophin 
A fragment of LXR-alpha and a fragment of utrophin 

TABLE 2 

Protein Complexes LXR-alpha/Zyxin Interaction
Oxysterol liver X receptor alpha (LXR-alpha) and zyxin 
A fragment of LXR-alpha and zyxin 
LXR-alpha and a fragment of zyxin 
A fragment of LXR-alpha and a fragment of zyxin 

TABLE 3 

Protein Complexes LXR-alpha/LIMS1 Interaction
Oxysterol liver X receptor alpha (LXR-alpha) and LIMS1
A fragment of LXR-alpha and LIMS1
LXR-alpha and a fragment of LIMS1
A fragment of LXR-alpha and a fragment of LIMS1 
TABLE 4

Protein Complexes LXR-alpha/PN7771 Interaction 
Oxysterol liver X receptor alpha (LXR-alpha) and PN7771 
A fragment of LXR-alpha and PN7771 
LXR-alpha and a fragment of PN7771 
A fragment of LXR-alpha and a fragment of PN7771 

TABLE 5 

Protein Complexes LXR-alpha/Homer-3 Interaction 
Oxysterol liver X receptor alpha (LXR-alpha) and Homer-3 
A fragment of LXR-alpha and Homer-3 
LXR-alpha and a fragment of Homer-3 
A fragment of LXR-alpha and a fragment of Homer-3 

TABLE 6 

Protein Complexes LXR-alpha/RACK1 Interaction
Oxysterol liver X receptor alpha (LXR-alpha) and RACK1 
A fragment of LXR-alpha and RACK1 
LXR-alpha and a fragment of RACK1 
A fragment of LXR-alpha and a fragment of RACK1 

TABLE 7 

Protein Complexes LXR-alpha/EIF3S1 Interaction
Oxysterol liver X receptor alpha (LXR-alpha) and EIF3S1 
A fragment of LXR-alpha and EIF3S1 
LXR-alpha and a fragment of EIF3S1 
A fragment of LXR-alpha and a fragment of EIF3S1 
TABLE 8

Protein Complexes LXR-alpha/PSMD11 Interaction
Oxysterol liver X receptor alpha (LXR-alpha) and PSMD11
A fragment of LXR-alpha and PSMD11
LXR-alpha and a fragment of PSMD11
A fragment of LXR-alpha and a fragment of PSMD11

TABLE 9 

Protein Complexes LXR-alpha/KIAA0610 Interaction 
Oxysterol liver X receptor alpha (LXR-alpha) and KIAA0610 
A fragment of LXR-alpha and KIAA0610 
LXR-alpha and a fragment of KIAA0610 
A fragment of LXR-alpha and a fragment of KIAA0610 

TABLE 10 

Protein Complexes LXR-alpha/CIR Interaction 
Oxysterol liver X receptor alpha (LXR-alpha) and CIR 
A fragment of LXR-alpha and CIR 
LXR-alpha and a fragment of CIR 
A fragment of LXR-alpha and a fragment of CIR 
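
Each of Tables 1-10 lists the same four complex forms for a given pair of interacting proteins. As an illustrative sketch only (the function name `complex_forms` is hypothetical and not part of the disclosure), the four rows of any one table could be generated from the two protein names as follows.

```python
from typing import List


def complex_forms(first: str, second: str) -> List[str]:
    """List the four complex forms for a pair of interacting proteins,
    in the row order used by the tables: full/full, fragment/full,
    full/fragment, fragment/fragment."""
    # Each tuple says whether the first and second protein appear as a fragment.
    row_order = [(False, False), (True, False), (False, True), (True, True)]
    rows = []
    for first_is_fragment, second_is_fragment in row_order:
        a = f"a fragment of {first}" if first_is_fragment else first
        b = f"a fragment of {second}" if second_is_fragment else second
        rows.append(f"{a} and {b}")
    return rows


for row in complex_forms("LXR-alpha", "CIR"):
    print(row)
```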

[0016] The involvement of the above interactions in particular pathways is as follows.

[0017] Many cellular proteins exert their function by interacting with other proteins in the
cell. Examples of this are found in the formation of multiprotein complexes and the association of
enzymes with their substrates. It is widely believed that a great deal of information can be gained
by understanding individual protein-protein interactions, and that this is useful in identifying
complex networks of interacting proteins that participate in the workings of normal cellular
functions. Ultimately, the knowledge gained by characterizing these networks can lead to valuable
insight into the causes of human diseases and can eventually lead to the development of therapeutic
strategies. The yeast two-hybrid assay is a powerful tool for determining protein-protein
interactions and it has been successfully used for studying human disease pathways. In one
variation of this technique, a protein of interest (or a portion of that protein) is expressed in a
population of yeast cells that collectively contain all protein sequences. Yeast cells that possess
protein sequences that interact with the protein of interest are then genetically selected, and the
identity of those interacting proteins is determined by DNA sequencing. Thus, proteins that can
be demonstrated to interact with a protein known to be involved in a human disease are
also implicated in that disease. Proteins identified in the first round of two-hybrid screening can be
subsequently used in a second round of two-hybrid screening, allowing the identification of multiple
proteins in the complex network of interactions in a disease pathway.
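
The round-by-round use of first-round hits as baits in a second round amounts to a breadth-limited expansion of an interaction network. The sketch below illustrates that bookkeeping only; `screen` stands in for an actual two-hybrid search and is a hypothetical callable supplied by the caller, and the function name `iterative_screen` is likewise an assumption.

```python
from typing import Callable, Dict, Iterable, Set


def iterative_screen(initial_bait: str,
                     screen: Callable[[str], Iterable[str]],
                     rounds: int = 2) -> Dict[str, Set[str]]:
    """Run `rounds` rounds of screening, using each new hit as a bait
    in the following round, and return a map of bait -> interactors."""
    network: Dict[str, Set[str]] = {}
    seen = {initial_bait}          # proteins already used or queued as baits
    baits = [initial_bait]
    for _ in range(rounds):
        next_baits = []
        for bait in baits:
            hits = set(screen(bait))
            network[bait] = hits
            for hit in sorted(hits):
                if hit not in seen:    # avoid screening the same protein twice
                    seen.add(hit)
                    next_baits.append(hit)
        baits = next_baits
    return network
```

With `rounds=2` this corresponds to the first and second rounds of screening mentioned above; larger values would extend the network further at the cost of additional searches.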

[0018] Nuclear hormone receptors play important roles in development, reproduction, and
physiology by altering gene transcription in response to hormonal signals (Whitfield et al., 1999;
Klein-Hitpass et al., 1998). Misregulation of hormone receptor signaling pathways is responsible
for a variety of diseases. For example, aldosterone and its receptor (the mineralocorticoid receptor,
MCR) are involved in hypertension and congestive heart failure (Duprez et al., 2000), and it has
recently been shown that a missense mutation in MCR that alters its ligand specificity is responsible
for pregnancy-exacerbated hypertension (Geller et al., 2000). Likewise, glucocorticoids and the
glucocorticoid receptor (GR) have been implicated in chronic inflammation and arthritis (Barnes,
1998), and the oxysterol liver X receptor (LXR), farnesoid X receptor (FXR), and other nuclear
receptors are involved in cholesterol homeostasis and atherogenesis (Schroepfer, 2000; Haynes et
al., 2000; Brown and Jessup, 1999).

[0019] Collectively, the nuclear receptor superfamily is responsive to a wide variety of
ligands. Nuclear hormone receptors share several important structural features, including a variable
N-terminal region, a conserved central DNA-binding domain, a variable hinge region, and a
conserved C-terminal ligand-binding domain (Moras and Gronemeyer, 1998; Mangelsdorf et al.,
1995). Despite this conserved structural organization, interactions between ligands and receptors
are remarkably specific. Hormone binding results in conformational changes in the receptor,
allowing binding to specific DNA sequences (hormone response elements, HREs) in target gene
promoters, resulting in changes in target gene transcription. Interaction of nuclear hormone receptors
with accessory proteins determines whether the receptor activates or represses transcription.
Receptors can recruit coactivators that remodel chromatin and stabilize the RNA polymerase
machinery, or alternatively can interact with factors that condense chromatin structure and inactivate
gene expression (Wolffe et al., 1997). Furthermore, binding of a nuclear hormone receptor to other
cellular proteins can alter the subcellular localization of the receptor and control its ability to bind
hormone and HREs (DeFranco et al., 1998). Clearly, identification of factors with which nuclear
hormone receptors interact is vital to understanding the process by which hormonal signals are
transduced into transcriptional responses. In addition, identification of receptor-interacting proteins
will increase the repertoire of potential targets for therapeutic intervention in the treatment of
diseases due to defects involving nuclear hormone signaling.
[0020] The oxysterol liver X receptor alpha (LXRα) was used in yeast two-hybrid searches
to identify novel protein-protein interactions. Here, we describe ten new interactors for LXRα. The
first four interactors are involved in cell adhesion and cellular architecture. The first of these is the
actin-binding protein utrophin. Utrophin is an autosomal gene that is similar to dystrophin, the gene
famous for its role in Duchenne's muscular dystrophy. Unlike dystrophin, however, utrophin
appears to be expressed in a wide variety of adult tissues. Dystrophin and the dystrophin-related
proteins contain spectrin repeats and likely play a role in anchoring the cytoskeleton to the plasma
membrane by their actin-binding activities. The second interactor is the adhesion plaque protein
zyxin, also involved in anchoring the cytoskeleton. Zyxin is a phosphoprotein that contains three
LIM domains and two proline-rich regions. The interaction of LXRα with zyxin is reminiscent of
the interaction we have identified between the farnesoid X-activated receptor and the LIM domain
cytoskeletal protein Paxillin. Third, LXRα interacts with the novel protein PN7771, which is highly
related (greater than 90% amino acid identity) to Ninein. Ninein is a centrosome-associated protein
that interacts with human glycogen synthase kinase 3beta (GSK-3beta) (Hong et al., 2000), is
localized to the pericentriolar matrix of the centrosome, and reacts with centrosomal autoantibody
sera (Mack et al., 1998). PN7771 contains predicted calcium-binding EF hand motifs, a potential
nuclear localization signal, a basic region-leucine zipper motif, a spectrin repeat, coiled-coil motifs,
and Glu- and Gln-rich regions. Taken together, these interactions suggest that LXRα may be
involved in cellular signaling events in response to cellular adhesion or other extracellular stimuli,
and that the trans-activating ability of LXRα may be regulated by its interaction with these proteins.

[0021] Several LXRα interactors are involved in signal transduction pathways. The first is
the neuronal immediate early protein homer-3. Homer proteins bind to the C-terminal tails of
metabotropic glutamate receptors and play a role in their targeting and regulation; the metabotropic
glutamate receptors, in turn, participate in the influx of intracellular calcium. Since LXR-alpha
binds to homer-3, it is possible that LXRα may also be involved in calcium release. Alternatively,
LXR-alpha could be modulated in some way by homer-3 in a manner analogous to the way in which
the metabotropic glutamate receptors are regulated. The second protein, RACK1 (receptor of
activated protein kinase C 1), is a WD repeat-containing protein that functions as an intracellular
receptor to localize PKC to the cytoskeleton. The interaction between RACK1 and LXR-alpha
suggests that LXR-alpha may be capable of localizing to the cytoskeleton via its association with
RACK1. The next interactor is the LIM-domain protein LIMS1. LIMS1 has been implicated in
integrin-linked kinase signaling, and it has been shown to interact with the SH3 and SH2
domain-containing adaptor protein NCK2 (Tu et al., 1998). Taken together, these findings suggest the
involvement of LXRα in a variety of signal transduction pathways; whether LXRα activity is
regulated by interaction with these proteins, or vice versa, remains to be determined.

[0022] Two LXRα interactors involved in protein metabolism were identified: the
proteasome subunit PSMD11, which is involved in protein turnover, and the translation initiation
factor EIF3S1, which is involved in protein synthesis. The interaction of these proteins with LXRα
suggests that nuclear hormone receptors may be involved directly with protein production and
stability, in addition to transcriptional regulation.

[0023] An interaction between LXRα and the potential transmembrane protein KIAA0610
was identified. KIAA0610 is a hypothetical protein fragment 686 amino acids in length. Predicted
structural motifs include four possible transmembrane domains and a coiled-coil domain.
KIAA0610 displays weak homology (~24% amino acid identity over 360-430 residues) to
Drosophila and C. elegans proteins of unknown function. EST analysis suggests expression in a
variety of tissues.

[0024] Finally, an interaction between LXRα and the transcription factor CIR was
identified. CIR has been demonstrated to interact with the CBF1 transcription factor as well as
histone deacetylase HD2 and Sin3-associated protein 30kD (Hsieh et al., 1999). It has been
proposed that CIR acts as a linker between CBF1 and the histone deacetylase complex. Similarly,
the interaction of LXRα with CIR suggests CIR may link LXRα with the histone deacetylase
machinery. In support of a functional role between nuclear receptors and CIR, we have also
identified an interaction between CIR and the estrogen receptor ER-beta.

[0025] The proteins disclosed in the present invention were found to interact with their
corresponding proteins in the yeast two-hybrid system. Because of the involvement of the corresponding
proteins in the physiological pathways disclosed herein, the proteins disclosed herein also participate in
the same physiological pathways. Therefore, the present invention provides a list of uses of these proteins
and DNA encoding these proteins for the development of diagnostic and therapeutic tools useful in the
physiological pathways. This list includes, but is not limited to, the following examples.
Two-Hybrid System

[0026] The principles and methods of the yeast two-hybrid system have been described in
detail elsewhere (e.g., Bartel and Fields, 1997; Bartel et al., 1993; Fields and Song, 1989; Chevray
and Nathans, 1992). The following is a description of the use of this system to identify proteins that
interact with a protein of interest.

[0027] The target protein is expressed in yeast as a fusion to the DNA-binding domain of
the yeast Gal4p. DNA encoding the target protein or a fragment of this protein is amplified from
cDNA by PCR or prepared from an available clone. The resulting DNA fragment is cloned by
ligation or recombination into a DNA-binding domain vector (e.g., pGBT9, pGBT.C, pAS2-1) such
that an in-frame fusion between the Gal4p and target protein sequences is created.

[0028] The target gene construct is introduced, by transformation, into a haploid yeast strain.
A library of activation domain fusions (i.e., adult brain cDNA cloned into an activation domain
vector) is introduced by transformation into a haploid yeast strain of the opposite mating type. The
yeast strain that carries the activation domain constructs contains one or more Gal4p-responsive
reporter gene(s), whose expression can be monitored. Examples of some yeast reporter strains
include Y190, PJ69, and CBY14a. An aliquot of yeast carrying the target gene construct is
combined with an aliquot of yeast carrying the activation domain library. The two yeast strains mate
to form diploid yeast and are plated on media that select for expression of one or more
Gal4p-responsive reporter genes. Colonies that arise after incubation are selected for further
characterization.

[0029] The activation domain plasmid is isolated from each colony obtained in the
two-hybrid search. The sequence of the insert in this construct is obtained by the dideoxy nucleotide
chain termination method. Sequence information is used to identify the gene/protein encoded by the
activation domain insert via analysis of the public nucleotide and protein databases. Interaction of
the activation domain fusion with the target protein is confirmed by testing for the specificity of the
interaction. The activation domain construct is co-transformed into a yeast reporter strain with either
the original target protein construct or a variety of other DNA-binding domain constructs.
Expression of the reporter genes in the presence of the target protein but not with other test proteins
indicates that the interaction is genuine.
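
The specificity criterion described in this paragraph reduces to a simple decision rule: the reporter must be expressed with the original target construct and with none of the other DNA-binding domain constructs. The following is a minimal sketch of that rule; the function name and the control construct labels in the usage example are hypothetical placeholders, not constructs named in this disclosure.

```python
from typing import Mapping


def is_specific_interaction(reporter_with_target: bool,
                            reporter_with_controls: Mapping[str, bool]) -> bool:
    """Return True when the reporter is expressed with the target construct
    but with none of the unrelated DNA-binding domain constructs."""
    return reporter_with_target and not any(reporter_with_controls.values())


# Hypothetical readout for one activation domain clone.
controls = {"control_construct_1": False, "control_construct_2": False}
print(is_specific_interaction(True, controls))
```

A clone that also activates the reporter with any control construct would be rejected under this rule, since its reporter expression does not depend on the target protein.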

WO 02/055657    PCT/US01/48561

[0030] In addition to the yeast two-hybrid system, other genetic methodologies are available
for the discovery or detection of protein-protein interactions. For example, a mammalian two-hybrid
system is available commercially (Clontech, Inc.) that operates on the same principle as the yeast
two-hybrid system. Instead of transforming a yeast reporter strain, plasmids encoding DNA-binding
and activation domain fusions are transfected along with an appropriate reporter gene (e.g., lacZ)
into a mammalian tissue culture cell line. Because transcription factors such as the Saccharomyces
cerevisiae Gal4p are functional in a variety of different eukaryotic cell types, it would be expected
that a two-hybrid assay could be performed in virtually any cell line of eukaryotic origin (e.g., insect
cells (SF9), fungal cells, worm cells, etc.). Other genetic systems for the detection of protein-protein
interactions include the so-called SOS recruitment system (Aronheim et al., 1997).

Protein-protein interactions

[0031] Protein interactions are detected in various systems including the yeast two-hybrid
system, affinity chromatography, co-immunoprecipitation, subcellular fractionation and isolation
of large molecular complexes. Each of these methods is well characterized and can be readily
performed by one skilled in the art. See, e.g., U.S. Patents No. 5,622,852 and 5,773,218, and PCT
published applications No. WO 97/27296 and WO 99/65939, each of which is incorporated herein
by reference.

[0032] The protein of interest can be produced in eukaryotic or prokaryotic systems. A
cDNA encoding the desired protein is introduced into an appropriate expression vector and transfected
into a host cell (which could be bacteria, yeast cells, insect cells, or mammalian cells). Purification
of the expressed protein is achieved by conventional biochemical and immunochemical methods
well known to those skilled in the art. The purified protein is then used for affinity chromatography
studies: it is immobilized on a matrix and loaded on a column. Extracts from cultured cells or
homogenized tissue samples are then loaded on the column in appropriate buffer, and non-binding
proteins are eluted. After extensive washing, binding proteins or protein complexes are eluted using
various methods such as a gradient of pH or a gradient of salt concentration. Eluted proteins can
then be separated by two-dimensional gel electrophoresis, eluted from the gel, and identified by
micro-sequencing. The purified proteins can also be used for affinity chromatography to purify
interacting proteins disclosed herein. All of these methods are well known to those skilled in the
art.

[0033] Similarly, both proteins of the complex of interest (or interacting domains thereof)
can be produced in eukaryotic or prokaryotic systems. The proteins (or interacting domains) can
be under control of separate promoters or can be produced as a fusion protein. The fusion protein
may include a peptide linker between the proteins (or interacting domains) which, in one
embodiment, serves to promote the interaction of the proteins (or interacting domains). All of these
methods are also well known to those skilled in the art.

[0034] Purified proteins of interest, individually or as a complex, can also be used to generate
antibodies in rabbit, mouse, rat, chicken, goat, sheep, pig, guinea pig, bovine, and horse. The
methods used for antibody generation and characterization are well known to those skilled in the
art. Monoclonal antibodies are also generated by conventional techniques. Single chain antibodies
are further produced by conventional techniques.

[0035] DNA molecules encoding proteins of interest can be inserted in the appropriate
expression vector and used for transfection of host cells such as bacteria, yeast, insect cells,
or mammalian cells, following methods well known to those skilled in the art. Transfected cells
expressing both proteins of interest are then lysed in appropriate conditions, one of the two proteins
is immunoprecipitated using a specific antibody, and the precipitate is analyzed by polyacrylamide gel
electrophoresis. The presence of the binding protein (co-immunoprecipitated) is detected by
immunoblotting using an antibody directed against the other protein. Co-immunoprecipitation is
a method well known to those skilled in the art.

[0036] Transfected eukaryotic cells or biological tissue samples can be homogenized and
fractionated in appropriate conditions that will separate the different cellular components. Typically,
cell lysates are run on sucrose gradients, or other materials that will separate cellular components
based on size and density. Subcellular fractions are analyzed for the presence of proteins of interest
with appropriate antibodies, using immunoblotting or immunoprecipitation methods. These methods 
are all well known to those skilled in the art. 

Disruption of protein-protein interactions 

[0037] It is conceivable that agents that disrupt protein-protein interactions can be beneficial
in many physiological disorders, including, but not limited to, NIDDM, AD and others disclosed
herein. Each of the methods described above for the detection of a positive protein-protein
interaction can also be used to identify drugs that will disrupt said interaction. As an example, cells
transfected with DNAs coding for proteins of interest can be treated with various drugs, and
co-immunoprecipitations can be performed. Alternatively, a derivative of the yeast two-hybrid system,
called the reverse yeast two-hybrid system (Leanna and Hannink, 1996), can be used, provided that
the two proteins interact in the straight yeast two-hybrid system.
Modulation of protein-protein interactions

[0038] Since the interactions described herein are involved in a physiological pathway, the
identification of agents which are capable of modulating the interactions will provide agents which
can be used to track a physiological disorder or used as lead compounds for development of therapeutic
agents. An agent may modulate expression of the genes of interacting proteins, thus affecting
interaction of the proteins. Alternatively, the agent may modulate the interaction of the proteins.
The agent may modulate the interaction of wild-type with wild-type proteins, wild-type with mutant
proteins, or mutant with mutant proteins. Agents which may be used to modulate the protein
interaction include a peptide, an antibody, a nucleic acid, an antisense compound or a ribozyme.
The nucleic acid may encode the antibody or the antisense compound. The peptide may be at least
4 amino acids of the sequence of either of the interacting proteins. Alternatively, the peptide may
be from 4 to 30 amino acids (or from 8 to 20 amino acids) that is at least 75% identical to a
contiguous span of amino acids of either of the interacting proteins. The peptide may be covalently
linked to a transporter capable of increasing cellular uptake of the peptide. Examples of a suitable
transporter include penetratins, l-Tat49-57, d-Tat49-57, retro-inverso isomers of l- or d-Tat49-57,
L-arginine oligomers, D-arginine oligomers, L-lysine oligomers, D-lysine oligomers, L-histidine
oligomers, D-histidine oligomers, L-ornithine oligomers, D-ornithine oligomers, short peptide
sequences derived from fibroblast growth factor, Galparan, and HSV-1 structural protein VP22, and
peptoid analogs thereof. Agents can be tested using transfected host cells, cell lines, cell models or
animals, such as described herein, by techniques well known to those of ordinary skill in the art, such
as disclosed in U.S. Patents Nos. 5,622,852 and 5,773,218, and PCT published application Nos. WO
97/27296 and WO 99/65939, each of which is incorporated herein by reference. The modulating
effect of the agent can be tested in vivo or in vitro. Agents can be provided for testing in a phage
display library or a combinatorial library. Exemplary of a method to screen agents is to measure the
effect that the agent has on the formation of the protein complex.
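
By way of illustration only, the peptide criterion stated above (4 to 30 amino acids, at least 75% identical to a contiguous span of an interacting protein) can be sketched as a sliding-window comparison. The function names and the toy sequences are hypothetical, and the ungapped, position-by-position comparison is an assumption; no particular comparison method is prescribed herein.

```python
from typing import Tuple


def best_span_identity(peptide: str, protein: str) -> Tuple[float, int]:
    """Return (best fractional identity, 0-based start) over all ungapped
    windows of the protein that have the same length as the peptide."""
    peptide = peptide.upper()
    protein = protein.upper()
    n = len(peptide)
    if n == 0 or n > len(protein):
        raise ValueError("peptide must be non-empty and no longer than the protein")
    best_fraction, best_start = 0.0, 0
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        # Count positions where the peptide and the window carry the same residue.
        matches = sum(a == b for a, b in zip(peptide, window))
        fraction = matches / n
        if fraction > best_fraction:
            best_fraction, best_start = fraction, start
    return best_fraction, best_start


def is_candidate_peptide(peptide: str, protein: str,
                         min_len: int = 4, max_len: int = 30,
                         min_identity: float = 0.75) -> bool:
    """Apply the length range and identity threshold given in the text."""
    if not (min_len <= len(peptide) <= max_len):
        return False
    fraction, _ = best_span_identity(peptide, protein)
    return fraction >= min_identity


# Toy (hypothetical) sequences, not taken from any protein disclosed herein.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
print(best_span_identity("FGHAKLMN", toy_protein))
print(is_candidate_peptide("FGHAKLMN", toy_protein))
```

The narrower range of 8 to 20 amino acids can be tested by passing `min_len=8, max_len=20`.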

Mutation screening 

[0039] The proteins disclosed in the present invention interact with one or more proteins
known to be involved in a physiological pathway, such as in NIDDM, AD or pathways described
herein. Mutations in interacting proteins could also be involved in the development of the
physiological disorder, such as NIDDM, AD or disorders described herein, for example, through
a modification of protein-protein interaction, or a modification of enzymatic activity, modification
of receptor activity, or through an unknown mechanism. Therefore, mutations can be found by 
sequencing the genes for the proteins of interest in patients having the physiological disorder, such 
as insulin, and non-affected controls. A mutation in these genes, especially in that portion of the 
gene involved in protein interactions in the physiological pathway, can be used as a diagnostic tool 
and the mechanistic understanding the mutation provides can help develop a therapeutic tool. 

Screening for at-risk individuals 

[0040] Individuals can be screened to identify those at risk by screening for mutations in the 
protein disclosed herein and identified as described above. Alternatively, individuals can be 
screened by analyzing the ability of the proteins of said individual disclosed herein to form natural 
complexes. Further, individuals can be screened by analyzing the levels of the complexes or 
individual proteins of the complexes or the mRNA encoding the protein members of the complexes.
Techniques to detect the formation of complexes, including those described above, are known to 
those skilled in the art. Techniques and methods to detect mutations are well known to those skilled 
in the art. Techniques to detect the level of the complexes, proteins or mRNA are well known to 
those skilled in the art. 

Cellular models of Physiological Disorders 

[0041] A number of cellular models of many physiological disorders or diseases have been 
generated. The presence and the use of these models are familiar to those skilled in the art. As an 
example, primary cell cultures or established cell lines can be transfected with expression vectors 
encoding the proteins of interest, either wild-type proteins or mutant proteins. The effect of the 
proteins disclosed herein on parameters relevant to their particular physiological disorder or disease 
can be readily measured. Furthermore, these cellular systems can be used to screen drugs that will 
influence those parameters, and thus be potential therapeutic tools for the particular physiological 
disorder or disease. Alternatively, instead of transfecting the DNA encoding the protein of interest, 
the purified protein of interest can be added to the culture medium of the cells under examination, 
and the relevant parameters measured. 
Animal models

[0042] The DNA encoding the protein of interest can be used to create animals that 
overexpress said protein, with wild-type or mutant sequences (such animals are referred to as 
"transgenic"), or animals which do not express the native gene but express the gene of a second 
animal (referred to as "transplacement"), or animals that do not express said protein (referred to as
"knock-out"). The knock-out animal may be an animal in which the gene is knocked out at a 
determined time. The generation of transgenic, transplacement and knock-out animals (normal and 
conditioned) uses methods well known to those skilled in the art. 

[0043] In these animals, parameters relevant to the particular physiological disorder can be
measured. These parameters may include receptor function, protein secretion in vivo or in vitro,
survival rate of cultured cells, concentration of particular protein in tissue homogenates, signal
transduction, behavioral analysis, protein synthesis, cell cycle regulation, transport of compounds
across cell or nuclear membranes, enzyme activity, oxidative stress, production of pathological
products, and the like. The measurements of biochemical and pathological parameters, and of
behavioral parameters, where appropriate, are performed using methods well known to those skilled
in the art. These transgenic, transplacement and knock-out animals can also be used to screen drugs 
that may influence the biochemical, pathological, and behavioral parameters relevant to the 
particular physiological disorder being studied. Cell lines can also be derived from these animals 
for use as cellular models of the physiological disorder, or in drug screening. 

Rational drug design 

[0044] The goal of rational drug design is to produce structural analogs of biologically
active polypeptides of interest or of small molecules with which they interact (e.g., agonists,
antagonists, inhibitors) in order to fashion drugs which are, for example, more active or stable forms
of the polypeptide, or which, e.g., enhance or interfere with the function of a polypeptide in vivo.
Several approaches for use in rational drug design include analysis of three-dimensional structure,
alanine scans, molecular modeling and use of anti-id antibodies. These techniques are well known
to those skilled in the art. Such techniques may include providing atomic coordinates defining a
three-dimensional structure of a protein complex formed by said first polypeptide and said second
polypeptide, and designing or selecting compounds capable of interfering with the interaction
between a first polypeptide and a second polypeptide based on said atomic coordinates.
[0045] Following identification of a substance which modulates or affects polypeptide
activity, the substance may be further investigated. Furthermore, it may be manufactured and/or used
in preparation, i.e., manufacture or formulation, of a composition such as a medicament,
pharmaceutical composition or drug. These may be administered to individuals.

[0046] A substance identified as a modulator of polypeptide function may be peptide or
non-peptide in nature. Non-peptide "small molecules" are often preferred for many in vivo
pharmaceutical uses. Accordingly, a mimetic or mimic of the substance (particularly if a peptide)
may be designed for pharmaceutical use.

[0047] The designing of mimetics to a known pharmaceutically active compound is a known
approach to the development of pharmaceuticals based on a "lead" compound. This approach might
be desirable where the active compound is difficult or expensive to synthesize or where it is
unsuitable for a particular method of administration, e.g., pure peptides are unsuitable active agents
for oral compositions as they tend to be quickly degraded by proteases in the alimentary canal.
Mimetic design, synthesis and testing is generally used to avoid randomly screening large numbers
of molecules for a target property.

[0048] Once the pharmacophore has been found, its structure is modeled according to its
physical properties, e.g., stereochemistry, bonding, size and/or charge, using data from a range of
sources, e.g., spectroscopic techniques, x-ray diffraction data and NMR. Computational analysis,
similarity mapping (which models the charge and/or volume of a pharmacophore, rather than the
bonding between atoms) and other techniques can be used in this modeling process.

[0049] A template molecule is then selected, onto which chemical groups that mimic the
pharmacophore can be grafted. The template molecule and the chemical groups grafted thereon can
be conveniently selected so that the mimetic is easy to synthesize, is likely to be pharmacologically
acceptable, and does not degrade in vivo, while retaining the biological activity of the lead
compound. Alternatively, where the mimetic is peptide-based, further stability can be achieved by
cyclizing the peptide, increasing its rigidity. The mimetic or mimetics found by this approach can 
then be screened to see whether they have the target property, or to what extent it is exhibited. 
Further optimization or modification can then be carried out to arrive at one or more final mimetics 
for in vivo or clinical testing. 

Diagnostic Assays

[0050] The identification of the interactions disclosed herein enables the development of
diagnostic assays and kits, which can be used to determine a predisposition to or the existence of
a physiological disorder. In one aspect, one of the proteins of the interaction is used to detect the
presence of a "normal" second protein (i.e., normal with respect to its ability to interact with the first
protein) in a cell extract or a biological fluid, and further, if desired, to detect the quantitative level
of the second protein in the extract or biological fluid. The absence of the "normal" second protein
would be indicative of a predisposition or existence of the physiological disorder. In a second
aspect, an antibody against the protein complex is used to detect the presence and/or quantitative
level of the protein complex. The absence of the protein complex would be indicative of a
predisposition or existence of the physiological disorder.

Nucleic Acids and Proteins 

[0051] A nucleic acid or fragment thereof has substantial identity with another if, when
optimally aligned (with appropriate nucleotide insertions or deletions) with the other nucleic acid
(or its complementary strand), there is nucleotide sequence identity in at least about 60% of the
nucleotide bases, usually at least about 70%, more usually at least about 80%, preferably at least
about 90%, more preferably at least about 95% of the nucleotide bases, and more preferably at least
about 98% of the nucleotide bases. A protein or fragment thereof has substantial identity with
another if, optimally aligned, there is an amino acid sequence identity of at least about 30% identity
with an entire naturally-occurring protein or a portion thereof, usually at least about 70% identity,
more usually at least about 80% identity, preferably at least about 90% identity, more preferably
at least about 95% identity, and most preferably at least about 98% identity.

[0052] Identity means the degree of sequence relatedness between two polypeptide or two
polynucleotide sequences as determined by the identity of the match between two strings of such
sequences. Identity can be readily calculated. While there exist a number of methods to measure
identity between two polynucleotide or polypeptide sequences, the term "identity" is well known
to skilled artisans (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press,
New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic
Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin,
H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne,
G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M
Stockton Press, New York, 1991). Methods commonly employed to determine identity between two
sequences include, but are not limited to, those disclosed in Guide to Huge Computers, Martin J.
Bishop, ed., Academic Press, San Diego, 1994, and Carillo, H., and Lipman, D., SIAM J Applied
Math. 48:1073 (1988). Preferred methods to determine identity are designed to give the largest
match between the two sequences tested. Such methods are codified in computer programs.
Preferred computer program methods to determine identity between two sequences include, but are
not limited to, GCG (Genetics Computer Group, Madison Wis.) program package (Devereux, J., et
al., Nucleic Acids Research 12(1):387 (1984)), BLASTP, BLASTN, FASTA (Altschul et al. (1990);
Altschul et al. (1997)). The well-known Smith Waterman algorithm may also be used to determine
identity.
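
As a minimal sketch of how percent identity may be computed after optimal alignment, the following uses a simple global (Needleman-Wunsch style) alignment rather than any of the program packages named above. The scoring values (match +1, mismatch -1, gap -1), the function names, and the convention of dividing the number of matches by the alignment length are assumptions made for illustration; other programs use other scoring schemes and denominators.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    if not a and not b:
        raise ValueError("at least one sequence must be non-empty")
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


print(global_align("ACGT", "AGT"))
print(percent_identity("ACGT", "AGT"))
```

The same functions accept amino acid strings, since the comparison is character by character.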

[0053] Alternatively, substantial homology or similarity exists when a nucleic acid or
fragment thereof will hybridize to another nucleic acid (or a complementary strand thereof) under
selective hybridization conditions, to a strand, or to its complement. Selectivity of hybridization
exists when hybridization which is substantially more selective than total lack of specificity occurs.
Nucleic acid hybridization will be affected by such conditions as salt concentration, temperature,
or organic solvents, in addition to the base composition, length of the complementary strands, and
the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily
appreciated by those skilled in the art. Stringent temperature conditions will generally include
temperatures in excess of 30°C, typically in excess of 37°C, and preferably in excess of 45°C.
Stringent salt conditions will ordinarily be less than 1000 mM, typically less than 500 mM, and
preferably less than 200 mM. However, the combination of parameters is much more important than
the measure of any single parameter. See, e.g., Ausubel, 1992; Wetmur and Davidson, 1968.
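
The temperature and salt thresholds recited above can be expressed, purely as an illustrative sketch, as two small lookup functions. The function names and the tier labels ("general", "typical", "preferred") are hypothetical. The functions treat each parameter separately only for convenience; as stated above, the combination of parameters matters more than any single one.

```python
def temperature_tier(temp_c):
    """Classify a hybridization temperature (degrees C) against the stated thresholds."""
    if temp_c > 45:
        return "preferred"
    if temp_c > 37:
        return "typical"
    if temp_c > 30:
        return "general"
    return None  # not in excess of 30 degrees C


def salt_tier(salt_mm):
    """Classify a salt concentration (mM) against the stated thresholds."""
    if salt_mm < 200:
        return "preferred"
    if salt_mm < 500:
        return "typical"
    if salt_mm < 1000:
        return "general"
    return None  # 1000 mM or more


# Example: report both tiers for one hypothetical set of conditions.
print(temperature_tier(42), salt_tier(150))
```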

[0054] The terms "isolated", "substantially pure", and "substantially homogeneous" are used
interchangeably to describe a protein or polypeptide which has been separated from components
which accompany it in its natural state. A monomeric protein is substantially pure when at least
about 60 to 75% of a sample exhibits a single polypeptide sequence. A substantially pure protein
will typically comprise about 60 to 90% W/W of a protein sample, more usually about 95%, and
preferably will be over about 99% pure. Protein purity or homogeneity may be indicated by a
number of means well known in the art, such as polyacrylamide gel electrophoresis of a protein
sample, followed by visualizing a single polypeptide band upon staining the gel. For certain
purposes, higher resolution may be provided by using HPLC or other means well known in the art
which are utilized for purification.

[0055] Large amounts of the nucleic acids of the present invention may be produced by (a)
replication in a suitable host or transgenic animals or (b) chemical synthesis using techniques well
known in the art. Constructs prepared for introduction into a prokaryotic or eukaryotic host may
comprise a replication system recognized by the host, including the intended polynucleotide
fragment encoding the desired polypeptide, and will preferably also include transcription and
translational initiation regulatory sequences operably linked to the polypeptide encoding segment.
Expression vectors may include, for example, an origin of replication or autonomously replicating
sequence (ARS) and expression control sequences, a promoter, an enhancer and necessary
processing information sites, such as ribosome-binding sites, RNA splice sites, polyadenylation
sites, transcriptional terminator sequences, and mRNA stabilizing sequences. Secretion signals may
also be included where appropriate which allow the protein to cross and/or lodge in cell membranes,
and thus attain its functional topology, or be secreted from the cell. Such vectors may be prepared
by means of standard recombinant techniques well known in the art.

[0056] The nucleic acid or protein may also be incorporated on a microarray. The 
preparation and use of microarrays are well known in the art. Generally, the microarray may contain 
the entire nucleic acid or protein, or it may contain one or more fragments of the nucleic acid or 
protein. Suitable nucleic acid fragments may include at least 17 nucleotides, at least 21 nucleotides,
at least 30 nucleotides or at least 50 nucleotides of the nucleic acid sequence, particularly the coding
sequence. Suitable protein fragments may include at least 4 amino acids, at least 8 amino acids, at 
least 12 amino acids, at least 15 amino acids, at least 17 amino acids or at least 20 amino acids. 
Thus, the present invention is also directed to such nucleic acid and protein fragments. 
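
As a hedged illustration of how fragments of a given minimum length might be enumerated from a longer sequence for placement on a microarray, the following sketch tiles a sequence with fixed-length windows. The function name, the `step` parameter, and the toy sequence are hypothetical; the text above specifies only the minimum fragment lengths, not how fragments are chosen.

```python
def tile_fragments(sequence, length, step=1):
    """Return all fragments of exactly `length` residues, starting every `step` positions."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [sequence[start:start + length]
            for start in range(0, len(sequence) - length + 1, step)]


# Toy sequence for demonstration; for nucleic acids the text recites lengths of
# at least 17, 21, 30 or 50 nucleotides, so length=17 would be the smallest choice.
print(tile_fragments("ACGTACGTAC", 4, 3))
```

A sequence shorter than the requested length yields an empty list, so a caller can detect that no suitable fragment exists.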

EXAMPLES

[0057] The present invention is further detailed in the following Examples, which are 
offered by way of illustration and are not intended to limit the invention in any manner. Standard 
techniques well known in the art or the techniques specifically described below are utilized. 
EXAMPLE 1
Yeast Two-Hybrid System
[0058] The principles and methods of the yeast two-hybrid systems have been described in
detail (Bartel and Fields, 1997). The following is thus a description of the particular procedure that
we used, which was applied to all proteins.

[0059] The cDNA encoding the bait protein was generated by PCR from brain cDNA.
Gene-specific primers were synthesized with appropriate tails added at their 5' ends to allow
recombination into the vector pGBTQ. The tail for the forward primer was
5'-gcaggaaacagctatgaccatacagtcagcggccgccacc-3' (SEQ ID NO:1) and the tail for the reverse
primer was 5'-acggccagtcgcgtggagtgttatgtcatgcggccgcta-3' (SEQ ID NO:2). The tailed
PCR product was then introduced by recombination into the yeast expression vector pGBTQ, which
is a close derivative of pGBTC (Bartel et al., 1996) in which the polylinker site has been modified
to include M13 sequencing sites. The new construct was selected directly in the yeast J693 for its
ability to drive tryptophan synthesis (genotype of this strain: Mat a, ade2, his3, leu2, trp1,
URA3::GAL1-lacZ LYS2::GAL1-HIS3 gal4del gal80del cyhR2). In these yeast cells, the bait is
produced as a C-terminal fusion protein with the DNA binding domain of the transcription factor
Gal4 (amino acids 1 to 147). A total human brain (37 year-old male Caucasian) cDNA library
cloned into the yeast expression vector pACT2 was purchased from Clontech (human brain
MATCHMAKER cDNA, cat. # HL4004AH), transformed into the yeast strain J692 (genotype of
this strain: Mat a, ade2, his3, leu2, trp1, URA3::GAL1-lacZ LYS2::GAL1-HIS3 gal4del gal80del
cyhR2), and selected for the ability to drive leucine synthesis. In these yeast cells, each cDNA is
expressed as a fusion protein with the transcription activation domain of the transcription factor
Gal4 (amino acids 768 to 881) and a 9 amino acid hemagglutinin epitope tag. J693 cells (Mat a
type) expressing the bait were then mated with J692 cells (Mat a type) expressing proteins from the
brain library. The resulting diploid yeast cells expressing proteins interacting with the bait protein
were selected for the ability to synthesize tryptophan, leucine, histidine, and β-galactosidase. DNA
was prepared from each clone, transformed by electroporation into E. coli strain KC8 (Clontech
KC8 electrocompetent cells, cat. # C2023-1), and the cells were selected on ampicillin-containing
plates in the absence of either tryptophan (selection for the bait plasmid) or leucine (selection for
the brain library plasmid). DNA for both plasmids was prepared and sequenced by the
dideoxynucleotide chain termination method. The identity of the bait cDNA insert was confirmed and
the cDNA insert from the brain library plasmid was identified using the BLAST program against public
nucleotide and protein databases. Plasmids from the brain library (preys) were then individually
transformed into yeast cells together with a plasmid driving the synthesis of lamin fused to the Gal4
DNA binding domain. Clones that gave a positive signal after β-galactosidase assay were considered
false-positives and discarded. Plasmids for the remaining clones were transformed into yeast cells
together with plasmid for the original bait. Clones that gave a positive signal after β-galactosidase
assay were considered true positives.
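
The two-step confirmation described above (a lamin control followed by a retest against the original bait) amounts to a simple decision rule, sketched below for illustration. The function name and the "not confirmed" label for clones that fail the retest are assumptions; the text does not state how such clones were designated.

```python
def classify_clone(positive_with_lamin, positive_with_bait):
    """Classify a prey clone from its two beta-galactosidase assay results."""
    if positive_with_lamin:
        # Signal with the unrelated lamin fusion indicates a non-specific interaction.
        return "false positive"
    if positive_with_bait:
        # No signal with lamin, and signal reproduced with the original bait.
        return "true positive"
    # Assumed label: the interaction was not reproduced on retest.
    return "not confirmed"


# Hypothetical assay results for three clones: (lamin result, bait result).
results = {"clone_1": (True, True), "clone_2": (False, True), "clone_3": (False, False)}
for name, (lamin, bait) in results.items():
    print(name, classify_clone(lamin, bait))
```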

EXAMPLE 2 
Identification of LXR-alpha/Utrophin Interaction
[0060] A yeast two-hybrid system as described in Example 1 using amino acids 95-277 of
LXR-alpha (GenBank (GB) accession no. U22662) as bait was performed. One clone that was
identified by this procedure included amino acids 2443-2650 of utrophin (GB accession no. X15488).



EXAMPLE 3 

Identification of LXR-alpha/PN7771 Interaction

[0061] A yeast two-hybrid system as described in Example 1 using amino acids 95-277 of
LXR-alpha (GB accession no. U22662) as bait was performed. One clone that was identified by
this procedure included amino acids 1747-2047 of PN7771. The DNA sequence and the predicted
protein sequence for PN7771 are set forth in Tables 11 and 12, respectively.

TABLE 11 

Nucleotide Sequence of PN7771 

cttattttgamcatttacatagtgatogttaacccaacagaccaatcctgggaagacagccagagc 
ataattaggagaagagacctgtccaagaccaggaacctggaccaaaattgtgccatgttgcmactttaatgagtggccccagta 
25 gtatggcagagctgttcacatttatcttctgtgtccacccagttctgctgaaacccctggcaagatcgtggccctgttgtagcttgtcatgtW 

gctgtcMggaaagaaagcaaacacaacctagagcaacattgam 

cgtttttcatgttgtaatgatctgccgtatggacga^^ 

ctgctttttttatagteaaag^ 

catggcgaaggcatcttcagatgtgcaggtttcaggcmcateggaa^^ 

30 gtggcrccagtgctgcagcagacattacttcaggacaacct^^ 

aactctgtcaaatgaagaacactttcaagaaccagactgctcactagaagctcagcccaaatatgttagaggtgggaagcgttacggacga^^ 
ccttgcccgagttccaagagtccgtggaggagtttcctgaagtgacggtgattgagccactggatgaagaagcgcggccttcacacatcccagc 
cggtgactgcagtgagcactggaagacgcaacgcagtgaggagtatgaagcggaaggccagttaaggttttggaacccagatgacttgaatgc 
ttcacagagtggatcttcccctccccaagactggatagaagagaaactgcaagaagmgtgaagatttggggatcacccgtgatggtcacctgaa 

35 ccggaagaagctggtctccatctgtgagcagtatggmacagaatgtggatggagagatgctcgaggaagtattccataatcttgatcctgacggt 

acaaigagtgtagaagattttttctatgg^ 
catgcagtctttcgatgagagtggacgacgtaccacaacctcatcagcaatgacaagtaccattggctttcgggtcttctcctgcctggatgatggg
atgggccatgcatctgtggagagaatactggacacctggcaggaagagggcattgagaacagccaggagatcctgaaggccttggatttcagc 
ctcgatggaaacatcaatttgacagaattaacactggcccttgaaaatgaacttttggttaccaagaacagcattcaccaggcggctctggccagct 
ttaaggctgaaatccggcatttgttggaacgagttgatcaggtggtcagagaaaaagagaagctacggtcagatctggacaaggccgagaagct 
5 caagtctttaatggcctcggaggtggatgatcaccatgcggccatagagcggcggaatgagtacaacctcaggaaactggatggagagtacaa 
ggagcgaatagcagccttaaaaaatgaactccgaaaagagagagagcagatcctgcagcaggcaggcaagcagcgtttagaacttgaacagg 
aaattgaaaaggcaaaaacagaagagaactatetccgggaccgccttgcccte^^ 

aatgcagagaagttggcagaatatgagaatctgacaaacaaacttcagagaaatttggaaaatgtgttagcagaaaagtttggtgacctcgatcct 

agcagtgctgagttcttcctgcaagaagagagactgacacagatgagaa^ 
10 aactccagtctgagctggaagaatatcgtgcacaaggcagagtgctcaggcttccgttgaagaactcaccgtcagaagaagttgaggctaacag 

cggtggcattgagcccgaacacgggctcggttctgaagaatgcaatccattgaatatgagcattgaggcagagctggtcattgaacagatgaaag 

aacaacatcacagggacatatgttgcctcagactggagctcgaagataaagtgcgccattatgaaaagcagctggacgaaaccgtggtc 

caagaaggcacaggagaacatgaagcaaaggcatgagaacgaaacgcgcaccttagaaaaacaaataagtgaccttaaaaatgaaattgctg 

acttcaggggcaagcagcagtgctcaaggaggcacatcatgaggccacttgcaggcatgaggaggagaaaaaacaactgcaagtgaagcttg 
15 aggaggaaaagactcacctgcaggagaagctgaggctgcaacatgagatggagctcaaggctagactgacacaggctcaagcaagctttgag 

cgggagagggaaggccttcagagtagcgcctggacagaagagaaggtgagaggcttgactcaggaactagagcagtttcaccaggagcagc 

tgacaagcctggtggagaaacacactcttgagaaagaggagttaagaaaagagctcttggaaaagcaccaaagggagcttcaggagggaagg 

gaaaaaatggaaacagagtgtaatagaagaacctctcaaatagaagcccagtte^ 

tctgcaaagcctggaggggcgctaccgccaagagctgaagga^ 
20 gacgagctcacccaggagtgtgcggaagcccaggagctgctgaaagagactcttaagagagagaaaacaacttctctggtcctgacccagga 

gagagagatgctggagaaaacatacaaagaacatttgaacagcatggtcgtcgagagacagcagctactccaagacctggaagacctaagaaa 

tgtatctgaaacccagcaaagcctgctgtctgaccagatacttgagctgaagagcagtcacaaaagggaactgagggagcgtgaggaggtcct 

gtgccaggcaggggcttcggagcagctggccagccagcggctggaaagactagaaatggaacatgaccaggaaaggcaggaaatgatgtcc 

aagcttctagccatggagaacattcacaaagcgacctgtgag 
25 agtaaaataaaggaaatgcagcaggcaacatctcctctctcaatgcttcagagtggttgccaggtgataggagaggaggaggtggaaggagat^ 

gagccctgtccctgcttcagcaaggggagcagctgttggaagaaaatggggacgtcctcttaagcctgcagagagctcatgaacaggcagtga 

aggaaaatgtgaaaatggctactgaaatttctagattgcaacagaggctaca 

gctactgagttttttggaaatactgcggaacaaacagagcagtttttacagcaaaaccgaacgaag 

cctaagtgacctggaagatgatgaggtccgggacctgggaagtaca^ 
30 agcttcagtagagggtttttctgagcttgaaaacagtgaagagacc^^ 

gctaatgatgttatgtgcggactgtgatcgagcttctgaaaaga 

gagaatccctgaggcttctcccaaatataagctgttgtatg^^ 

cacgctacgatgaggcactagaaaataacaaagaactcactgcagaggttttcaggttgcaggatgagctgaagaaaatggaggaagtca 
aacattcctcagcctggaaaagagttacgatgaggtcaaaatagaaaatgaggggctgaatgttctggMgagacttcaaggcaagattgag 
35 gcttcaggaaagcgtggtccagcggtgtgactgctgc^ 

tcaatcagacactggaagagtgtgtgcccagggttaggagtgtacatcatgtcatagaggaatgtaagcaagaaaaccagtaccttg 

cacacagctcttggaaaaagtaaaagcacatgaaattgcctggttacatgg 

gttatactggaggaaaacactactctcctaggctttcaagacaaac^ 

ttacaggagctgactaggaagttgaaggagagagtcactattttagttaagcaaaaagatgtactttctcacggagaaaaggagg 
40 ggcaatgatgcatgacttgcagatcacgtgcagtgagatgcagc^ 
tatfflgagaaatgaaattactactttaaatgaagaagatagcatttrt^ 
aacggaaactgtaaaacaagaaaatgctgcagttcagaagatggttgaaaamaa^ 

ggatttggaaaatacagaacttagccaaaagaactctcaaaaccaggaaaaactgcaagaacttaatcaacgtctaacagaaatgctatgc 
aggaaaaagagccaggaaacagtgcattggaggaacgggaacaagagaagtttaatct^ 
45 ccactttagtgtcttctctggaggcggagctcte^^ 

gaaaatgaaacagctgcacagatgtcccgatctctctgacttc^^ 

ggaagctctgagtgaggaattaaatagctgtgtcgataagttggcaaaatcaagtcttttagagcatagaattgcgacgatgaagcagg 
atcctgggaacatcagagtgcgagcttaaagtcacagctggtggcttctcaggaaaaggttcagaatttagaagacaccgtgcagaatgtaaacc
tgcaaatgtcccggatgaaatctgacctacgagtgactcagcaggaaaaggaggctttaaaacaagaagtgatgtctttacataagcaacttcaga 
atgctggtggcaagagctgggccccagagatagctactcatccatcagggctccataaccagcagaaaaggctgtcctgggacaagttggatca 
tctgatgaatgaggaacagcagctgctttggcaagagaatgagaggctccagaccatggtacagaacaccaaagccgaactcacgcactcccg 
5 ggagaaggtccgtcaattggaatccaatcttcttcccaagcaccaaaaacatctaaacccatcaggtaccatgaatcccacagagcaagaaaaatt 
gagcttaaagagagagtgtgatcagtttcagaaagaacaatctcctgctaacaggaaggtcagtcagatgaattcccttgaacaagaattagaaac 
aattcatttggaaaatgaaggcctgaaaaagaaacaagtaaaactggatga 
agcccgtcccctcatgcttgggatttgcagctgctccagc^^ 

gctgcaggcagaaaggataaaccagcacctgcaggaggaacttgaaaacaggacctccgaaaccaacacaccacagggaaaccaggaaca 
10 actggtaactgtcatggaggaacgaatgatagaagttgaacagaaactgaaactagtgaaaaggcttcttcaagagaaagtgaatc 
aacaactctgcaagaacactaaggcagacgcaatggtgaaggacttg^ 

cgacagaaaacagcagagaagaaaaattacctcctggaggagaagattgccagcctcagtaatatagttaggaatctgacaccagcgccattga 
cttctacacctcctttgaggtcatagccaaaccaaagggta^ 
aattaagcctaacctaaaactgccagcaacacaactggagtttccaW^ 
15 tcacatatttcactacttaaattattcccaagatttgaatttat^ 
ataccagtffigaatgagtttttgtg^ 
taaactttattttaaaaccaaatttaggtgttcttacatattta^ 

taattggttaggaggcagggcttattaagtggttattaaccgctgacatcagacaaacccaaatctgtagaattctaacctc^ 

tattaccactcttcttgtattatagatttagaactgatttactcaattgcactcttaactaatgttaaaagcttactt^ 
20 aaaagtttcatttggggagctggtcttctaagaaacggata^ 

ggagtaaatagctactcttagaaaagagggataagcagaccatgtaggttttctgtctctcaaatctogagtt 

gaactcagggaacaatactgtaaactgtcttcctgaactactgtagggcctctctaagaatttgaaatgtataaaccatgtgacctcato 

tatatttacagccatactagaattmatttctacgtttttagtaaatttaatattctgggggaaaaaaggcctt 

aagagtttatttaatafoggtcaaaattttctgt^ 
25 ctgaggtcaaggagtgccaaataggactttccactcatgctc^ 

aattataatctctgcttctagccacttccgccagcagttg^ 

catgagacaataatccgaaaagttcgctttgatatattcctggagggccaagcccatctatttacaaaaggtgaacagcaaaatcaagcactgcW 
atgggcaggaacacaagagaaagcaaactgcccaagaagtcatcatgtcagaaactcaatctcaacaaaataatttccatcag 
gmcttgggggcttatgagtctcaccggtcaacccaggaggcctcactacaagagccttgacaaggcactgttt^ 
ctgatgaagcaaacctttgaattmgcacagctct^
caaaatafflgacatctgctattatgccttctttagatcm^ 
aaaattgtttcacttaaaactgtggattggcctaggctaaggacaaa^ 
gttattgctgttggaactgaacaataatatttccca^ 

gtgtaggttaatttgtttatttcctataaatttgtatttatgtgtatataaaatgtacaatgaatgtaaatatgact^ 
ctctattcaaaatcaaaatgctgctcaaatgaatttaaccaacatctaggtgcttaatttctcattttatcc
gagaaataccatacagataccttaaatgtatgcatttgtgcaacaat^ 
tatagcttttaaaagactttttaaagacattaaatgta^ 

tttataggcamagttgcttattaaaagcactgatWcaaacttmgatttaagaacaattatttaa 
aaagggaatcaagtttgcctttgagataatacgttacactaaga 
gctttttagaagcctgcacttaagcttagattt^
aaatccagattttttaaactgttttaaatgte^ 
cataggcttccaaggcataggaagagatcttgcaggtctagt^^ 
gggtttatggcagacmgcttmtaacatgtgagaaatgaattttttatmgtgatto^ 
tattgtcagcatcttaaagg^ 

attggaatgatataattatacaagtaatgccaaaaaccaagtcaaagcctaattaaccaaagcactcatttaaaaatcatcatgW^
acctctcagcactgtaaaatagttttggttttg^ 
ttaagttcttaatggggagacattatcatggcatgacttaagg 
agataaagattaatgagttctgtgtttatatcagctttgtatatttcatcttagccattctatcctagaaagattttaatgtgagcttaagatgtaaataaata
attttgcaaacatgaaaaaaaaaaaaaaaaa (SEQ ID NO:3)

TABLE 12 
Predicted Amino Acid Sequence of PN7771 

MAEVTWRVYVWGfflCMAKASSDVQVSGFHRKIQHVKNELCHMLSLEEVAPVLQQTLL 

QDNLLGRVHFDQFKEALILILSRTLSNEEHFQEPDCSLEAQPKYVRGGKRYGRRSLPEFQES 

VEEFPEVTVIEPLDEEAJRPSHPAGDCSEHWKTQRSEEYEAEGQLRFWNPDDLNASQSGSSP 

PQDWIEEKLQEVCEDLGITRDGHLKRKXLVSICEQYGLQNVDGEMLEEWHNLDPDGTMS 

VEDFFYGLFKNGKSLTPSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCLDD 

GMGHASVERILDTWQEEGffiNSQEILKALDFSLDGNINLTELTLALENELLVTKNSIHQAAL 

ASFKAEIRIILLERVDQVVREKEKLRSDLDKAEKLKSLMASEVDDHHAAIER 

DGEYKERIAALKNELRKEREQILQQAGKQRLELEQEIEKAKTEENYIRDRLALSLKENSRLE 

^LLENAEKLAEYENLTNKLQRNLENVLAEKFGDLDPSSAEFFLQEEPJLTQMRNEYERQCR 

VLQDQVDELQSELEEYRAQGRVLRLPLKNSPSEEVEANSGGffiPEHGLGSEECNPLNMSIEA 

ELVIEQMKEQHHRDICCLRLELEDKVRHYEKQLDETVVSCKKAQENMKQRHENETRTLEK 

QISDLKNEIAELQGQAAVLKEAHHEATCRHEEEKXQLQVKLEEEKTHLQEKLRLQHEMEL 

KARLTQAQASFEREREGLQSSAWTEEKVRGLTQELEQFHQEQLTSLVEKHTLEKEELRKEL 

LEKHQRELQEGREKMETECNRRTSQffiAQFQSDCQKVTERCESALQSLEGRYRQELKDLQE 

QQREEKSQWEFEKT)ELTQECAEAQELLKETLK31EKTTSLVLTQEREMLEKTYKEHLNSMV 

VERQQLLQDLEDLRNVSETQQSLLSDQILELKSSHKRELREREEVLCQAGASEQLASQRLER 

LEMEHDQERQEMMSKLLAMENIHKATCETADRERAEMSTEISRLQSKIKEMQQATSPLSM 

LQSGCQVIGEEEVEGDGALSLLQQGEQLLEENGDVLLSLQRAHEQAVKENVKMATEISRLQ 

QRLQKLEPGLVMSSCLDEPATEFFGNTAEQTEQFLQQNRTKQVEGVTRRHVLSDLEDDEVR 

DLGSTGTSSVQRQEWIEESEASVEGFSELENSEETRTESWELKNQISQLQEQLMMLCADCD 

RASEKK.QDLLFDVSVLKKKLKMLERIPEASPKYKLLYEDVSRENDCLQEELRMMETRYDE 

ALENNKELTAEVFRLQDELKXMEEVTETFLSLEKSYDEVmNEGLNVLVLRLQGmKLQE 

SWQRCDCCLWEASLENLEIEPDGNILQLNQTLEECVPRVRSVHHVIEECKQENQYLEGNT 

QLLEKVKAHEIAWLHGTIQTHQERPRVQNQmEENTTLLGFQDKHFQHQATIAELELEKTK 

LQELTRKLKERVTILVKQKDVLSHGEKEEELKAMMHDLQITCSEMQQKVELLRYESEKLQ 

QENSILRNEITTLNEEDSISNLKLGTLNGSQEEMWQKTETVKQENAAVQKMVENLKKQISE 

LJmOSfQQLDLENTELSQKNSQNQEKLQELNQRLTEMLCQKEKEPGNSALEEREQEKFNLKI 

ELERCKVQSSTLVSSLEAELSEVKIQTHWQQENHLLKDELEKMKQLHRCPDLSDFQQKISS 

VLSYOTKLLKEKEALSEELNSCVDKIJ^SLLEHPJATMKQEQKSWEHQSASLKSQLVASQ 

EKVQNLEDTVQNVNLQMSRMKSDLRVTQQEKEALKQEVMSLHKQLQNAGGKSWAPEIAT 

HPSGLHNQQKRLSWDKLDHLMNEEQQLLWQENERLQTMVQNTKAELTHSREKVRQLESN 

LLPKHQKHLWSGTMNPTEQEKLSLKRECDQFQKEQSPANRKVSQMNSLEQELETJHLENE 

GLKKKQVKLDEQLMEMQHLRSTATPSPSPHAWDLQLLQQQACPMVPREQFLQLQRQLLQ 

AERINQHLQEELENRTSETNTPQGNQEQLVTVMEERMffiVEQKLKLVKRLLQEKVNQLKEQ 

LCKNTKADAMVKDLWENAQLLKjVLEVTEQ 

TSTPPLRS (SEQ ID NO:4)
EXAMPLES 4-12
Identification of Protein-Protein Interactions
[0062] A yeast two-hybrid system as described in Example 1 was performed using the amino acids
of the bait set forth in Table 13. The clone that was identified by this procedure for each bait
is set forth in Table 13 as the prey. "AA" refers to the amino acids of the bait or prey, and
"NUC" refers to the nucleotides of the bait or prey. The accession numbers are GenBank (GB)
accession numbers.
EXAMPLE 13

Generation of Polyclonal Antibody Against Protein Complexes 
[0063] As shown above, LXR-alpha interacts with utrophin to form a complex. A complex
of the two proteins is prepared, e.g., by mixing purified preparations of each of the two proteins.
If desired, the protein complex can be stabilized by cross-linking the proteins in the complex, by
methods known to those of skill in the art. The protein complex is used to immunize rabbits and
mice using a procedure similar to that described by Harlow et al. (1988). This procedure has been
shown to generate antibodies against various other proteins (for example, see Kraemer et al., 1993).

[0064] Briefly, purified protein complex is used as immunogen in rabbits. Rabbits are
immunized with 100 µg of the protein in complete Freund's adjuvant and boosted twice at
three-week intervals, first with 100 µg of immunogen in incomplete Freund's adjuvant, and then with
100 µg of immunogen in PBS. Antibody-containing serum is collected two weeks thereafter. The
antiserum is preadsorbed with LXR-alpha and utrophin, such that the remaining antiserum comprises
antibodies which bind conformational epitopes, i.e., complex-specific epitopes, present on the
LXR-alpha-utrophin complex but not on the monomers.
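
By way of illustration only, the timing of the rabbit immunization described in paragraph [0064] can be laid out as a small Python sketch. The function name and the label strings are hypothetical, and the sketch assumes that the first boost is given three weeks after the primary immunization and that "two weeks thereafter" is counted from the second boost; the text does not state either point explicitly.

```python
from datetime import date, timedelta


def immunization_schedule(start: date):
    """Return (step, date) pairs for the rabbit protocol of paragraph [0064].

    Assumptions (not stated explicitly in the text): the first boost falls
    three weeks after priming, and serum is collected two weeks after the
    second boost.
    """
    boost_interval = timedelta(weeks=3)  # boosts at three-week intervals
    bleed_delay = timedelta(weeks=2)     # serum collected two weeks later

    first_boost = start + boost_interval
    second_boost = first_boost + boost_interval
    return [
        ("prime: 100 ug complex in complete Freund's adjuvant", start),
        ("boost 1: 100 ug in incomplete Freund's adjuvant", first_boost),
        ("boost 2: 100 ug in PBS", second_boost),
        ("collect antibody-containing serum", second_boost + bleed_delay),
    ]


if __name__ == "__main__":
    for step, day in immunization_schedule(date(2024, 1, 1)):
        print(day.isoformat(), step)
```

Under these assumptions the whole protocol spans eight weeks from priming to serum collection.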

[0065] Polyclonal antibodies against each of the complexes set forth in Tables 1-10 are 
prepared in a similar manner by mixing the specified proteins together, immunizing an animal and 
isolating antibodies specific for the protein complex, but not for the individual proteins. 

[0066] Polyclonal antibodies against the protein set forth in Table 12 are prepared in a
similar manner by immunizing an animal with the protein and isolating antibodies specific for the
protein.

EXAMPLE 14 

Generation of Monoclonal Antibodies Specific for Protein Complexes 
[0067] Monoclonal antibodies are generated according to the following protocol. Mice are
immunized with immunogen comprising LXR-alpha/utrophin complexes conjugated to keyhole
limpet hemocyanin using glutaraldehyde or EDC as is well known in the art. The complexes can
be prepared as described in Example 13, and may also be stabilized by cross-linking. The
immunogen is mixed with an adjuvant. Each mouse receives four injections of 10 to 100 µg of
immunogen, and after the fourth injection blood samples are taken from the mice to determine if the
serum contains antibody to the immunogen. Serum titer is determined by ELISA or RIA. Mice
with sera indicating the presence of antibody to the immunogen are selected for hybridoma
production.

[0068] Spleens are removed from immune mice and a single-cell suspension is prepared
(Harlow et al., 1988). Cell fusions are performed essentially as described by Kohler et al. (1975).
Briefly, P3.65.3 myeloma cells (American Type Culture Collection, Rockville, MD) or NS-1
myeloma cells are fused with immune spleen cells using polyethylene glycol as described by Harlow
et al. (1988). Cells are plated at a density of 2×10^5 cells/well in 96-well tissue culture plates.
Individual wells are examined for growth, and the supernatants of wells with growth are tested for
the presence of LXR-alpha/utrophin complex-specific antibodies by ELISA or RIA using
LXR-alpha/utrophin complex as target protein. Cells in positive wells are expanded and subcloned to
establish and confirm monoclonality.
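
As a rough aid to planning the plating step of paragraph [0068], the following Python sketch computes how many wells and 96-well plates a given number of fused cells would occupy at the stated density of 2×10^5 cells per well. The function names are hypothetical, and the sketch assumes that every well is filled to the stated density and that partially used plates count as whole plates.

```python
CELLS_PER_WELL = 200_000   # plating density stated in paragraph [0068]
WELLS_PER_PLATE = 96       # 96-well tissue culture plate


def wells_needed(total_cells: int) -> int:
    """Number of wells required, rounding up to a whole well."""
    if total_cells < 0:
        raise ValueError("total_cells must be non-negative")
    return -(-total_cells // CELLS_PER_WELL)  # integer ceiling division


def plates_needed(total_cells: int) -> int:
    """Number of 96-well plates required, rounding up to a whole plate."""
    return -(-wells_needed(total_cells) // WELLS_PER_PLATE)


if __name__ == "__main__":
    cells = 50_000_000  # illustrative cell count, not taken from the text
    print(wells_needed(cells), "wells on", plates_needed(cells), "plates")
```

A full plate at this density holds 1.92×10^7 cells.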

[0069] Clones with the desired specificities are expanded and grown as ascites in mice or
in a hollow fiber system to produce sufficient quantities of antibodies for characterization and assay
development. Antibodies are tested for binding to LXR-alpha alone or to utrophin alone, to
determine which are specific for the LXR-alpha/utrophin complex as opposed to those that bind to
the individual proteins.

[0070] Monoclonal antibodies against each of the complexes set forth in Tables 1-10 are
prepared in a similar manner by mixing the specified proteins together, immunizing an animal,
fusing spleen cells with myeloma cells and isolating clones which produce antibodies specific for
the protein complex, but not for the individual proteins.

[0071] Monoclonal antibodies against the protein set forth in Table 12 are prepared in a 
similar manner by immunizing an animal with the protein, fusing spleen cells with myeloma cells 
and isolating clones which produce antibodies specific for the protein. 

EXAMPLE 15

In vitro Identification of Modulators for Protein-Protein Interactions 
[0072] The present invention is useful in screening for agents that modulate the interaction
of LXR-alpha and utrophin. The knowledge that LXR-alpha and utrophin form a complex is useful
in designing such assays. Candidate agents are screened by mixing LXR-alpha and utrophin (a) in
the presence of a candidate agent, and (b) in the absence of the candidate agent. The amount of
complex formed is measured for each sample. An agent modulates the interaction of LXR-alpha
and utrophin if the amount of complex formed in the presence of the agent is greater than
(promoting the interaction), or less than (inhibiting the interaction) the amount of complex formed
in the absence of the agent. The amount of complex is measured by a binding assay, which shows
the formation of the complex, or by using antibodies immunoreactive to the complex.

[0073] Briefly, a binding assay is performed in which immobilized LXR-alpha is used to
bind labeled utrophin. The labeled utrophin is contacted with the immobilized LXR-alpha under
aqueous conditions that permit specific binding of the two proteins to form an LXR-alpha/utrophin
complex in the absence of an added test agent. Particular aqueous conditions may be selected
according to conventional methods. Any reaction condition can be used as long as specific binding
of LXR-alpha/utrophin occurs in the control reaction. A parallel binding assay is performed in
which the test agent is added to the reaction mixture. The amount of labeled utrophin bound to the
immobilized LXR-alpha is determined for the reactions in the absence and in the presence of the
test agent. If the amount of bound, labeled utrophin in the presence of the test agent is different from
the amount of bound, labeled utrophin in the absence of the test agent, the test agent is a modulator of
the interaction of LXR-alpha and utrophin.
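
By way of illustration only, the comparison described in paragraphs [0072] and [0073] can be sketched in Python as follows. The function name, the returned labels and the optional `tolerance` parameter are hypothetical and are not part of the disclosed assay. With the default tolerance of zero, any difference between the two measurements classifies the agent as a modulator, as stated in the text; a non-zero tolerance is an added assumption that a practitioner might use to allow for measurement noise.

```python
def classify_agent(bound_with_agent: float,
                   bound_without_agent: float,
                   tolerance: float = 0.0) -> str:
    """Compare bound, labeled utrophin with and without a test agent.

    Both arguments are amounts of labeled utrophin bound to immobilized
    LXR-alpha, in the same arbitrary units. `tolerance` is a hypothetical
    noise margin in those units; it is not part of the source protocol.
    """
    if bound_without_agent <= 0:
        # The control reaction must show specific binding to be usable.
        raise ValueError("control reaction shows no binding")
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")

    difference = bound_with_agent - bound_without_agent
    if difference > tolerance:
        return "promotes interaction"
    if difference < -tolerance:
        return "inhibits interaction"
    return "no modulation detected"


if __name__ == "__main__":
    # Illustrative readings only; not data from the source.
    print(classify_agent(40.0, 100.0))
    print(classify_agent(150.0, 100.0))
```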

[0074] Candidate agents for modulating the interaction of each of the protein complexes set
forth in Tables 1-10 are screened in vitro in a similar manner.

EXAMPLE 16 

In vivo Identification of Modulators for Protein-Protein Interactions 
[0075] In addition to the in vitro method described in Example 15, an in vivo assay can also
be used to screen for agents which modulate the interaction of LXR-alpha and utrophin. Briefly,
a yeast two-hybrid system is used in which the yeast cells express (1) a first fusion protein
comprising LXR-alpha or a fragment thereof and a first transcriptional regulatory protein sequence,
e.g., GAL4 activation domain, (2) a second fusion protein comprising utrophin or a fragment thereof
and a second transcriptional regulatory protein sequence, e.g., GAL4 DNA-binding domain, and (3)
a reporter gene, e.g., β-galactosidase, which is transcribed when an intermolecular complex
comprising the first fusion protein and the second fusion protein is formed. Parallel reactions are
performed in the absence of a test agent as the control and in the presence of the test agent. A
functional LXR-alpha/utrophin complex is detected by measuring the amount of reporter gene
expression. If the amount of reporter gene expression in the presence of the test agent is different
from the amount of reporter gene expression in the absence of the test agent, the test agent is a
modulator of the interaction of LXR-alpha and utrophin.
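
The reporter comparison in paragraph [0075] can likewise be sketched in Python, here for replicate reporter readings (for example, β-galactosidase activity in arbitrary units). This is a minimal sketch under stated assumptions: the use of replicates, the fold-change measure, the function names and the `min_fold` threshold are all hypothetical and do not appear in the source, which requires only that expression differ between the two conditions.

```python
from statistics import mean


def reporter_fold_change(with_agent, without_agent) -> float:
    """Mean reporter signal with the test agent divided by the control mean."""
    control = mean(without_agent)
    if control <= 0:
        raise ValueError("control reporter signal must be positive")
    return mean(with_agent) / control


def is_modulator(with_agent, without_agent, min_fold: float = 1.0) -> bool:
    """True if expression rises or falls by more than `min_fold`.

    With the default min_fold of 1.0 any difference counts, matching the
    text; a larger value is a hypothetical stringency setting.
    """
    if min_fold < 1.0:
        raise ValueError("min_fold must be at least 1.0")
    fold = reporter_fold_change(with_agent, without_agent)
    return fold > min_fold or fold < 1.0 / min_fold


if __name__ == "__main__":
    # Illustrative readings only; not data from the source.
    treated = [10.0, 30.0]
    control = [40.0, 40.0]
    print(reporter_fold_change(treated, control), is_modulator(treated, control))
```

A fold change below 1 corresponds to reduced reporter expression in the presence of the test agent, and a fold change above 1 to increased expression.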
[0076] Candidate agents for modulating the interaction of each of the protein complexes set
forth in Tables 1-10 are screened in vivo in a similar manner.

[0077] While the invention has been disclosed in this patent application by reference to the
details of preferred embodiments of the invention, it is to be understood that the disclosure is
intended in an illustrative rather than in a limiting sense, as it is contemplated that modifications will
readily occur to those skilled in the art, within the spirit of the invention and the scope of the
appended claims.
WHAT IS CLAIMED IS:

1. An isolated protein complex comprising two proteins, the protein complex selected from the
group consisting of: 

(i) a complex of a first protein and a second protein; 

(ii) a complex of a fragment of said first protein and said second protein; 

(iii) a complex of said first protein and a fragment of said second protein; and 

(iv) a complex of a fragment of said first protein and a fragment of said second 
protein, wherein said first protein is LXR-alpha and said second protein is selected from 
the group consisting of utrophin, zyxin, LMS1, PN7771, Homer-3, RACK1, EIF3S1, 
PSMD11, KIAA0610 and CIR. 

2. The protein complex of claim 1, wherein said protein complex comprises said first protein 
and said second protein. 

3. The protein complex of claim 1, wherein said protein complex comprises a fragment of said
first protein and said second protein or said first protein and a fragment of said second 
protein. 

4. The protein complex of claim 1, wherein said protein complex comprises fragments of said
first protein and said second protein. 

5. An isolated antibody selectively immunoreactive with a protein complex of claim 1.

6. The antibody of claim 5, wherein said antibody is a monoclonal antibody. 

7. A method for diagnosing a physiological disorder in an animal, which comprises assaying 
for: 

(a) whether a protein complex set forth in claim 1 is present in a tissue extract; 

(b) the ability of proteins to form a protein complex set forth in claim 1; and 

(c) a mutation in a gene encoding a protein of a protein complex set forth in claim 
8. The method of claim 7, wherein said animal is a human.

9. The method of claim 8, wherein said physiological disorder is selected from the group
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

10. The method of claim 7, wherein the diagnosis is for a predisposition to said physiological 
disorder. 

11. The method of claim 7, wherein the diagnosis is for the existence of said physiological
disorder. 

12. The method of claim 7, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 


13. The method of claim 7, wherein said assay comprises a yeast two-hybrid assay. 

14. The method of claim 7, wherein said assay comprises measuring in vitro a complex formed 
by combining the proteins of the protein complex, said proteins isolated from said animal. 


15. The method of claim 14, wherein said complex is measured by binding with an antibody 
specific for said complex. 

16. The method of claim 7, wherein said assay comprises mixing an antibody specific for said
protein complex with a tissue extract from said animal and measuring the binding of said
antibody.

17. A method for determining whether a mutation in a gene encoding one of the proteins of a
protein complex set forth in claim 1 is useful for diagnosing a physiological disorder, which
comprises assaying for the ability of said protein with said mutation to form a complex with
the other protein of said protein complex, wherein an inability to form said complex is
indicative of said mutation being useful for diagnosing a physiological disorder.
18. The method of claim 17, wherein said gene is an animal gene.

19. The method of claim 18, wherein said animal is a human.

20. The method of claim 19, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

21. The method of claim 17, wherein the diagnosis is for a predisposition to a physiological
disorder.

22. The method of claim 17, wherein the diagnosis is for the existence of a physiological 
disorder. 

23. The method of claim 17, wherein said assay comprises a yeast two-hybrid assay.

24. The method of claim 17, wherein said assay comprises measuring in vitro a complex formed
by combining the proteins of the protein complex, said proteins isolated from an animal. 

25. The method of claim 24, wherein said animal is a human.

26. The method of claim 24, wherein said complex is measured by binding with an antibody 
specific for said complex. 

27. A non-human animal model for a physiological disorder wherein the genome of said animal
or an ancestor thereof has been modified such that the formation of a protein complex set 
forth in claim 1 has been altered. 



28. The non-human animal model of claim 27, wherein said physiological disorder is selected
from the group consisting of disorders associated with cholesterol homeostasis and
atherogenesis.



29. The non-human animal model of claim 27, wherein the formation of said protein complex
has been altered as a result of: 

(a) over-expression of at least one of the proteins of said protein complex; 

(b) replacement of a gene for at least one of the proteins of said protein complex with 
a gene from a second animal and expression of said protein; 

(c) expression of a mutant form of at least one of the proteins of said protein 
complex; 

(d) a lack of expression of at least one of the proteins of said protein complex; or 

(e) reduced expression of at least one of the proteins of said protein complex. 

30. A cell line obtained from the animal model of claim 27. 



31. A non-human animal model for a physiological disorder, wherein the biological activity of 
a protein complex set forth in claim 1 has been altered. 


32. The non-human animal model of claim 31, wherein said physiological disorder is selected
from the group consisting of disorders associated with cholesterol homeostasis and
atherogenesis.

33. The non-human animal model of claim 31, wherein said biological activity has been altered
as a result of: 

(a) disrupting the formation of said complex; or 

(b) disrupting the action of said complex. 

34. The non-human animal model of claim 31, wherein the formation of said complex is
disrupted by binding an antibody to at least one of the proteins which form said protein 
complex. 



35. The non-human animal model of claim 31, wherein the action of said complex is disrupted
by binding an antibody to said complex.

36. The non-human animal model of claim 31, wherein the formation of said complex is
disrupted by binding a small molecule to at least one of the proteins which form said protein 
complex. 

37. The non-human animal model of claim 31, wherein the action of said complex is disrupted
by binding a small molecule to said complex. 

38. A cell line in which the genome of cells of said cell line has been modified to produce at least
one protein complex set forth in claim 1.

39. A cell line in which the genome of the cells of said cell line has been modified to eliminate
at least one protein of a protein complex set forth in claim 1.

40. A composition comprising:

a first expression vector having a nucleic acid encoding a first protein or a
homologue or derivative or fragment thereof; and

a second expression vector having a nucleic acid encoding a second protein, or a
homologue or derivative or fragment thereof, wherein said first and said second proteins are
the proteins of claim 1.


41. A host cell comprising:

a first expression vector having a nucleic acid encoding a first protein which is first
protein or a homologue or derivative or fragment thereof; and

a second expression vector having a nucleic acid encoding a second protein which
is second protein, or a homologue or derivative or fragment thereof, wherein said
first and said second proteins are the proteins of claim 1.

42. The host cell of claim 41, wherein said host cell is a yeast cell.

43. The host cell of claim 41, wherein said first and second proteins are expressed in fusion
proteins. 
44. The host cell of claim 41, wherein one of said first and second nucleic acids is linked to a
nucleic acid encoding a DNA binding domain, and the other of said first and second nucleic 
acids is linked to a nucleic acid encoding a transcription-activation domain, whereby two 
fusion proteins can be produced in said host cell. 

45. The host cell of claim 41, further comprising a reporter gene, wherein the expression of the
reporter gene is determined by the interaction between the first protein and the second 
protein. 

46. A method for screening for drug candidates capable of modulating the interaction of the
proteins of a protein complex, the protein complex selected from the group consisting of the
protein complexes of claim 1, said method comprising:

(i) combining the proteins of said protein complex in the presence of a drug to form
a first complex;

(ii) combining the proteins in the absence of said drug to form a second complex;

(iii) measuring the amount of said first complex and said second complex; and

(iv) comparing the amount of said first complex with the amount of said second
complex,

wherein if the amount of said first complex is greater than or less than the amount of said
second complex, then the drug is a drug candidate for modulating the interaction of the
proteins of said protein complex.

47. The method of claim 46, wherein said screening is an in vitro screening. 

48. The method of claim 46, wherein said complex is measured by binding with an antibody
specific for said protein complexes. 






49. The method of claim 46, wherein if the amount of said first complex is greater than the
amount of said second complex, then said drug is a drug candidate for promoting the
interaction of said proteins.



WO 02/055657 PCT/US01/48561

50. The method of claim 46, wherein if the amount of said first complex is less than the amount
of said second complex, then said drug is a drug candidate for inhibiting the interaction of 
said proteins. 

51. A drug useful for treating a physiological disorder identified by the method of claim 46.

52. The drug of claim 51, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

53. A method of screening for drug candidates useful in treating a physiological disorder which
comprises the steps of:

(a) measuring the activity of a protein selected from the group consisting of a first
protein and a second protein in the presence of a drug, wherein said first and second proteins
are selected from the group consisting of the proteins of claim 1,

(b) measuring the activity of said protein in the absence of said drug, and

(c) comparing the activity measured in steps (a) and (b),

wherein if there is a difference in activity, then said drug is a drug candidate for treating said
physiological disorder.

54. A drug useful for treating a physiological disorder identified by the method of claim 53.

55. The drug of claim 54, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

56. A method for selecting modulators of a protein complex formed between a first protein or
a homologue or derivative or fragment thereof and a second protein or a homologue or
derivative or fragment thereof, wherein said first and second proteins are selected from the
group consisting of the proteins of claim 1, said method comprising:

providing the protein complex;

contacting said protein complex with a test compound; and

determining the presence or absence of binding of said test compound to said protein
complex.
57. A modulator useful for treating a physiological disorder identified by the method of claim
56. 

58. The modulator of claim 57, wherein said physiological disorder is selected from the group
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

59. A method for selecting modulators of an interaction between a first protein and a second
protein, said first protein or a homologue or derivative or fragment thereof and said second
protein or a homologue or derivative or fragment thereof, wherein said first and second
proteins are selected from the group consisting of the proteins of claim 1, said method
comprising:

contacting said first protein with said second protein in the presence of a test
compound; and

determining the interaction between said first protein and said second protein.

60. The method of claim 59, wherein at least one of said first and second proteins is a fusion 
protein having a detectable tag. 

61. The method of claim 59, wherein said step of determining the interaction between said first
protein and said second protein is conducted in a substantially cell free environment. 

62. The method of claim 59, wherein the interaction between said first protein and said second 
protein is determined in a host cell.

63. The method of claim 62, wherein said host cell is a yeast cell. 

64. The method of claim 59, wherein said test compound is provided in a phage display library. 



65. The method of claim 59, wherein said test compound is provided in a combinatorial library.

66. A modulator useful for treating a physiological disorder identified by the method of claim
59. 



67. The modulator of claim 66, wherein said physiological disorder is selected from the group
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

68. A method for selecting modulators of a protein complex formed from a first protein or a
homologue or derivative or fragment thereof, and a second protein or a homologue or
derivative or fragment thereof, wherein said first and second proteins are selected from the
group consisting of the proteins of claim 1, said method comprising:

contacting said protein complex with a test compound; and

determining the interaction between said first protein and said second protein.

69. A modulator useful for treating a physiological disorder identified by the method of claim 
68.

70. The modulator of claim 69, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

71. A method for selecting modulators of an interaction between a first polypeptide and a second
polypeptide, said first polypeptide being a first protein or a homologue or derivative or
fragment thereof and said second polypeptide being a second protein or a homologue or
derivative or fragment thereof, wherein said first and second proteins are selected from the
group consisting of the proteins of claim 1, said method comprising:

providing in a host cell a first fusion protein having said first polypeptide, and a
second fusion protein having said second polypeptide, wherein a DNA binding domain is
fused to one of said first and second polypeptides while a transcription-activating domain
is fused to the other of said first and second polypeptides;

providing in said host cell a reporter gene, wherein the transcription of the reporter
gene is determined by the interaction between the first polypeptide and the second
polypeptide;

allowing said first and second fusion proteins to interact with each other within said
host cell in the presence of a test compound; and

determining the presence or absence of expression of said reporter gene.

72. The method of claim 71, wherein said host cell is a yeast cell. 

73. A modulator useful for treating a physiological disorder identified by the method of claim
71. 

74. The modulator of claim 73, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

75. A method for identifying a compound that binds to a protein in vitro, wherein said protein 
is selected from the group consisting of the proteins of claim 1, said method comprising: 

contacting a test compound with said protein for a time sufficient to form a complex;
and

detecting the formation of a complex by detecting said protein or the compound
in the complex,

so that if a complex is detected, a compound that binds to said protein is identified.

76. A compound useful for treating a physiological disorder identified by the method of claim 
75. 

77. The compound of claim 76, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

78. A method for selecting modulators of an interaction between a first polypeptide and a second 
polypeptide, said first polypeptide being a first protein or a homologue or derivative or 
fragment thereof and said second polypeptide being a second protein or a homologue or 
derivative or fragment thereof, wherein said first and second proteins are selected from the 
group consisting of the proteins of claim 1, said method comprising: 
providing atomic coordinates defining a three-dimensional structure of a protein
complex formed by said first polypeptide and said second polypeptide; and 

designing or selecting compounds capable of modulating the interaction between a 
first polypeptide and a second polypeptide based on said atomic coordinates. 


79. A modulator useful for treating a physiological disorder identified by the method of claim 
78. 



80. The modulator of claim 79, wherein said physiological disorder is selected from the group
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

81. A method for providing inhibitors of an interaction between a first polypeptide and a second
polypeptide, said first polypeptide being a first protein or a homologue or derivative or
fragment thereof and said second polypeptide being a second protein or a homologue or
derivative or fragment thereof, wherein said first and second proteins are selected from the
group consisting of the proteins of claim 1, said method comprising:

providing atomic coordinates defining a three-dimensional structure of a protein
complex formed by said first polypeptide and said second polypeptide; and

designing or selecting compounds capable of interfering with the interaction between
a first polypeptide and a second polypeptide based on said atomic coordinates.

82. An inhibitor useful for treating a physiological disorder identified by the method of claim 
81. 

83. The inhibitor of claim 82, wherein said physiological disorder is selected from the group
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

84. A method for selecting modulators of a protein, wherein said protein is selected from the 
group consisting of the proteins of claim 1, said method comprising:
contacting said protein with a test compound; and

determining binding of said test compound to said protein.

85. The method of claim 84, wherein said test compound is provided in a phage display library.

86. The method of claim 84, wherein said test compound is provided in a combinatorial library. 

87. A modulator useful for treating a physiological disorder identified by the method of claim 
84. 

88. The modulator of claim 87, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

89. A method for modulating, in a cell, a protein complex having a first protein interacting with 
a second protein, wherein said first and second proteins are selected from the group 
consisting of the proteins of claim 1, said method comprising: 

administering to said cell a compound capable of modulating said protein complex. 

90. The method of claim 89, wherein said compound is selected from the group consisting of: 

(a) a compound which is capable of interfering with the interaction between said first
protein and said second protein, 

(b) a compound which is capable of binding at least one of said first protein and said 
second protein, 

(c) a compound which comprises a peptide having a contiguous span of amino acids 
of at least 4 amino acids of said second protein and capable of binding said first protein,

(d) a compound which comprises a peptide capable of binding said first protein and 
having an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical to 
a contiguous span of amino acids of said second protein of the same length, 

(e) a compound which comprises a peptide having a contiguous span of amino acids 
of at least 4 amino acids of said first protein and capable of binding said second protein, 

(f) a compound which comprises a peptide capable of binding said second protein 
and having an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical 
to a contiguous span of amino acids of said first protein of the same length, 

(g) a compound which is an antibody immunoreactive with said first protein or said 
second protein,

(h) a compound which is a nucleic acid encoding an antibody immunoreactive with
said first protein or said second protein, 

(i) a compound which modulates the expression of said first protein or said second
protein,

(j) a compound which is an antisense compound or a ribozyme specifically 
hybridizing to a nucleic acid encoding said first protein or complement thereof, and 

(k) a compound which is an antisense compound or a ribozyme specifically 
hybridizing to a nucleic acid encoding said second protein or complement thereof. 

91. A method for modulating, in a cell, a protein complex having a first protein interacting with 
a second protein, wherein said first and second proteins are selected from the group 
consisting of the proteins of claim 1, said method comprising:

administering to said cell a peptide capable of interfering with the interaction 
between said first protein and said second protein, wherein said peptide is associated with 
a transporter capable of increasing cellular uptake of said peptide. 

92. The method of claim 91, wherein said peptide is covalently linked to said transporter which
is selected from the group consisting of penetratins, l-Tat49-57, d-Tat49-57, retro-inverso isomers
of l- or d-Tat49-57, L-arginine oligomers, D-arginine oligomers, L-lysine oligomers, D-lysine
oligomers, L-histidine oligomers, D-histidine oligomers, L-ornithine oligomers, D-ornithine
oligomers, short peptide sequences derived from fibroblast growth factor, Galparan, and
HSV-1 structural protein VP22, and peptoid analogs thereof.

93. A method for modulating, in a cell, the interaction of a protein with a ligand, wherein said
protein is selected from the group consisting of the first or second proteins of claim 1, said 
method comprising: 

administering to said cell a compound capable of modulating said interaction. 

94. The method of claim 93, wherein said protein is one of said first or second proteins and said 
ligand is the other of said first or second proteins.

95. The method of claim 93, wherein said compound is selected from the group consisting of:

(a) a compound which interferes with said interaction,

(b) a compound which binds to said protein or said ligand, 

(c) a compound which comprises a peptide having a contiguous span of amino acids 
of at least 4 amino acids of said protein and capable of binding said ligand, 

(d) a compound which comprises a peptide capable of binding said ligand and having
an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical to a
contiguous span of amino acids of said protein of the same length,

(e) a compound which is an antibody immunoreactive with said protein or said
ligand,

(f) a compound which is a nucleic acid encoding an antibody immunoreactive with
said ligand or said protein,

(g) a compound which modulates the expression of said protein or said ligand, and 

(h) a compound which is an antisense compound or a ribozyme specifically 
hybridizing to a nucleic acid encoding said ligand or said protein or complement thereof. 

96. A method for modulating neuronal death in a patient having a physiological disorder 
comprising: 

modulating a protein complex having a first protein interacting with a second 
protein, wherein said first and second proteins are selected from the group consisting of the 
proteins of claim 1.

97. The method of claim 96, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 



98. A method for modulating neuronal death in a patient having a physiological disorder
comprising: 

administering to the patient a compound capable of modulating a protein complex 
having a first protein interacting with a second protein, wherein said first and second 
proteins are selected from the group consisting of the proteins of claim 1. 

99. The method of claim 98, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

100. The method of claim 98, wherein said compound is selected from the group consisting of:

(a) a compound which is capable of interfering with the interaction between said first 
protein and said second protein, 

(b) a compound which is capable of binding at least one of said first protein and said 
second protein, 

(c) a compound which comprises a peptide having a contiguous span of amino acids 
of at least 4 amino acids of a second protein and capable of binding a first protein, 

(d) a compound which comprises a peptide capable of binding a first protein and 
having an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical to 
a contiguous span of amino acids of a second protein of the same length, 

(e) a compound which comprises a peptide having a contiguous span of amino acids 
of at least 4 amino acids of a first protein and capable of binding a second protein,

(f) a compound which comprises a peptide capable of binding a second protein and 
having an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical to 
a contiguous span of amino acids of a first protein of the same length, 

(g) a compound which is an antibody immunoreactive with a first protein or a second 
protein, 

(h) a compound which is a nucleic acid encoding an antibody immunoreactive with 
a first protein or a second protein, 

(i) a compound which modulates the expression of a first protein or a second protein,

(j) a compound which is an antisense compound or a ribozyme specifically
hybridizing to a nucleic acid encoding a first protein or complement thereof, and

(k) a compound which is an antisense compound or a ribozyme specifically
hybridizing to a nucleic acid encoding a second protein or complement thereof.

101. A method for modulating neuronal death in a patient having a physiological disorder
comprising: 

administering to said cell a peptide capable of interfering with the interaction 
between a first protein and a second protein, wherein said first and second proteins are 
selected from the group consisting of the proteins of claim 1, wherein said peptide is 
associated with a transporter capable of increasing cellular uptake of said peptide.

102. The method of claim 101, wherein said peptide is covalently linked to said transporter which
is selected from the group consisting of penetratins, l-Tat49-57, d-Tat49-57, retro-inverso isomers
of l- or d-Tat49-57, L-arginine oligomers, D-arginine oligomers, L-lysine oligomers, D-lysine
oligomers, L-histidine oligomers, D-histidine oligomers, L-ornithine oligomers, D-ornithine
oligomers, short peptide sequences derived from fibroblast growth factor, Galparan, and
HSV-1 structural protein VP22, and peptoid analogs thereof.

103. A method for treating a physiological disorder comprising:

administering to a patient in need of treatment a compound capable of modulating
a protein complex having a first protein interacting with a second protein, wherein said first
and second proteins are selected from the group consisting of the proteins of claim 1.

104. The method of claim 103, wherein said physiological disorder is selected from the group
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

105. The method of claim 103, wherein said compound is selected from the group consisting of:

(a) a compound which is capable of interfering with the interaction between said first
protein and said second protein,

(b) a compound which is capable of binding at least one of said first protein and said
second protein,

(c) a compound which comprises a peptide having a contiguous span of amino acids
of at least 4 amino acids of said second protein and capable of binding said first protein,

(d) a compound which comprises a peptide capable of binding said first protein and
having an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical to
a contiguous span of amino acids of said second protein of the same length,

(e) a compound which comprises a peptide having a contiguous span of amino acids
of at least 4 amino acids of said first protein and capable of binding said second protein,

(f) a compound which comprises a peptide capable of binding said second protein
and having an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical
to a contiguous span of amino acids of said first protein of the same length,

(g) a compound which is an antibody immunoreactive with said first protein or said
second protein, 

(h) a compound which is a nucleic acid encoding an antibody immunoreactive with
said first protein or said second protein, 

(i) a compound which modulates the expression of said first protein or said second 
protein, 

(j) a compound which is an antisense compound or a ribozyme specifically 
hybridizing to a nucleic acid encoding a first protein or complement thereof, 

(k) a compound which is an antisense compound or a ribozyme specifically 
hybridizing to a nucleic acid encoding a second protein or complement thereof, and 

(l) a compound which is capable of strengthening the interaction between said first
protein and said second protein. 

106. A method for treating a physiological disorder comprising: 

administering to said cell a peptide capable of interfering with the interaction 
between a first protein and a second protein, wherein said first and second proteins are 
selected from the group consisting of the proteins of claim 1, wherein said peptide is 
associated with a transporter capable of increasing cellular uptake of said peptide. 

107. The method of claim 106, wherein said peptide is covalently linked to said transporter which 
is selected from the group consisting of penetratins, l-Tat49-57, d-Tat49-57, retro-inverso isomers
of l- or d-Tat49-57, L-arginine oligomers, D-arginine oligomers, L-lysine oligomers, D-lysine
oligomers, L-histidine oligomers, D-histidine oligomers, L-ornithine oligomers, D-ornithine
oligomers, short peptide sequences derived from fibroblast growth factor, Galparan, and
HSV-1 structural protein VP22, and peptoid analogs thereof.

108. The method of claim 106, wherein said physiological disorder is selected from the group 
consisting of disorders associated with cholesterol homeostasis and atherogenesis. 

109. A method for treating a physiological disorder comprising:

administering to a patient in need of treatment a compound capable of modulating
the activity of a first protein or a second protein, wherein said first and second proteins are 
selected from the group consisting of the proteins of claim 1. 

110. The method of claim 109, wherein said physiological disorder is selected from the group
consisting of disorders associated with cholesterol homeostasis and atherogenesis.

111. The method of claim 109, wherein the activity is the interaction of said first protein or said
second protein with a ligand.

112. The method of claim 111, wherein said ligand is the other of said first or second protein.



113. A method of modulating activity in a cell of a protein, said protein being a first protein or a
second protein selected from the group consisting of the proteins of claim 1, said method
comprising:

administering to said cell a compound capable of modulating said protein. 

114. The method of claim 113, wherein said compound is selected from the group consisting of:

(a) a compound which is capable of binding said protein,

(b) a compound which comprises a peptide having a contiguous span of at least 4
amino acids of a first protein and capable of binding a second protein,

(c) a compound which comprises a peptide capable of binding a second protein and
having an amino acid sequence of from 4 to 30 amino acids that is at least 75% identical to
a contiguous span of amino acids of a first protein of the same length,

(d) a compound which is an antibody immunoreactive with said protein,

(e) a compound which is a nucleic acid encoding an antibody immunoreactive with
said protein, and

(f) a compound which is an antisense compound or a ribozyme specifically
hybridizing to a nucleic acid encoding said protein or complement thereof.

115. A method for modulating activities of a protein in a cell, said protein being a first protein
or a second protein selected from the group consisting of the proteins of claim 1, said 
method comprising: 

administering to said cell a peptide having a contiguous span of at least 4 amino acids 
of one of said first or second proteins and capable of binding the other of said first or second 
proteins, wherein said peptide is associated with a transporter capable of increasing cellular 
uptake of said peptide. 

116. The method of claim 115, wherein said peptide is covalently linked to said transporter which
is selected from the group consisting of penetratins, l-Tat49-57, d-Tat49-57, retro-inverso isomers
of l- or d-Tat49-57, L-arginine oligomers, D-arginine oligomers, L-lysine oligomers, D-lysine
oligomers, L-histidine oligomers, D-histidine oligomers, L-ornithine oligomers, D-ornithine
oligomers, short peptide sequences derived from fibroblast growth factor, Galparan, and
HSV-1 structural protein VP22, and peptoid analogs thereof.

117. An isolated nucleic acid encoding a protein comprising an amino acid sequence set forth in 
SEQ ID NO:4.

118. The isolated nucleic acid sequence of claim 117 which comprises nucleotides 544-6960 of
SEQ ID NO:3 or complement thereof. 
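
As an illustrative aside on the coordinates recited in claim 118: positions in the sequence listing are 1-based and inclusive, so the span 544-6960 of SEQ ID NO:3 covers 6960 - 544 + 1 = 6417 nucleotides, which is 2139 codons. The Python sketch below shows that arithmetic and the matching string slice. The helper names (`span_length`, `extract_span`, `residue_index`) are hypothetical and introduced only for this illustration; they are not part of the application.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive span such as (544)..(6960)."""
    return end - start + 1


def extract_span(sequence: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive span of `sequence` as a Python slice."""
    return sequence[start - 1:end]


def residue_index(nt_position: int, cds_start: int = 544) -> int:
    """1-based residue number of the codon that ends at `nt_position`.

    `nt_position` must be the last base of a codon, as are the running
    nucleotide counts printed at the end of each row of the listing.
    """
    offset = nt_position - cds_start + 1
    if offset <= 0 or offset % 3 != 0:
        raise ValueError("position is not the last base of a codon in the CDS")
    return offset // 3


cds_nt = span_length(544, 6960)   # 6417 nucleotides
cds_codons = cds_nt // 3          # 2139 codons
print(cds_nt, cds_codons)
print(residue_index(588))         # first translated row ends at residue 15
```

The same helper can be used to cross-check the residue numbering under each translated row of SEQ ID NO:3 against the nucleotide count at the end of that row.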

119. An isolated nucleic acid encoding a protein comprising an amino acid sequence which is at 
least 70% identical to the amino acid sequence set forth in SEQ ID NO:4 and which is 
capable of interacting with LXR-alpha. 

120. An isolated nucleic acid comprising a nucleotide sequence which is at least 60% identical 
to nucleotides 544-6960 of SEQ ID NO:3 or complement thereof. 

121. An isolated nucleic acid sequence comprising a nucleotide sequence set forth in SEQ ID 
NO:3 or complement thereof.

122. An isolated nucleic acid comprising a contiguous span of at least 17 nucleotides of the
nucleotide sequence set forth in SEQ ID NO:3 or complement thereof.

123. The isolated nucleic acid of claim 122 comprising at least 21 nucleotides. 

124. The isolated nucleic acid of claim 122 comprising at least 25 nucleotides.

125. The isolated nucleic acid of claim 122 comprising at least 30 nucleotides.

126. The isolated nucleic acid of claim 122 comprising at least 50 nucleotides.
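
Whether a candidate oligonucleotide is a "contiguous span ... or complement thereof" in the sense of claims 122-126 can be checked mechanically. The following is a minimal Python sketch, not a definitive test. It assumes that "complement" means the opposite strand read 5' to 3' (the reverse complement), and it uses only the first 60 nucleotides of SEQ ID NO:3 as reference data. The helper names `reverse_complement` and `is_contiguous_span` are hypothetical.

```python
# Lookup table for complementing DNA bases in either case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse, giving the opposite strand 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_contiguous_span(probe: str, reference: str, min_len: int = 17) -> bool:
    """True if `probe` is at least `min_len` nt long and occurs as an exact,
    contiguous substring of `reference` or of its reverse complement."""
    probe = probe.lower().replace(" ", "")
    reference = reference.lower().replace(" ", "")
    if len(probe) < min_len:
        return False
    return probe in reference or probe in reverse_complement(reference)


# First 60 nucleotides of SEQ ID NO:3 as printed in the sequence listing.
SEQ3_HEAD = "cttattttga aaacatttac atagtgatta gttaacccaa cagaccaatc ctgggaagac"

probe = SEQ3_HEAD.replace(" ", "")[10:27]                  # a 17-nt window
print(is_contiguous_span(probe, SEQ3_HEAD))                # True
print(is_contiguous_span(reverse_complement(probe), SEQ3_HEAD))  # True
```

Raising `min_len` to 21, 25, 30 or 50 corresponds to the lengths recited in claims 123-126.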

127. An isolated nucleic acid comprising at least 21 nucleotides that encodes a contiguous span 
of at least 7 amino acids of the amino acid sequence set forth in SEQ ID NO:4. 

128. The isolated nucleic acid of claim 127 encoding at least 8 contiguous amino acids.

129. The isolated nucleic acid of claim 127 encoding at least 9 contiguous amino acids.

130. The isolated nucleic acid of claim 127 encoding at least 10 contiguous amino acids.

131. The isolated nucleic acid of claim 127 encoding at least 15 contiguous amino acids.

132. The isolated nucleic acid of claim 127 encoding at least 20 contiguous amino acids.

133. The isolated nucleic acid of claim 127 encoding at least 25 contiguous amino acids.

134. A nucleic acid vector comprising the isolated nucleic acid of claim 117. 

135. A nucleic acid vector comprising the isolated nucleic acid of claim 118. 

136. A nucleic acid vector comprising the isolated nucleic acid of claim 119. 



137. A nucleic acid vector comprising the isolated nucleic acid of claim 124.

138. A nucleic acid vector comprising the isolated nucleic acid of claim 130.

139. A host cell comprising the isolated nucleic acid of claim 117.

140. A host cell comprising the isolated nucleic acid of claim 118.

141. A host cell comprising the isolated nucleic acid of claim 119.

142. A host cell comprising the isolated nucleic acid of claim 116.

143. A host cell comprising the isolated nucleic acid of claim 130.

144. A microarray comprising the isolated nucleic acid of claim 130.

145. An isolated polypeptide comprising an amino acid sequence set forth in SEQ ID NO:4. 

146. An isolated polypeptide comprising an amino acid sequence that is at least 70% identical to
the amino acid sequence set forth in SEQ ID NO:4 and capable of interacting with LXR-alpha.

147. An isolated polypeptide comprising a contiguous span of at least 8 amino acids of the amino 
acid sequence set forth in SEQ ID NO:4. 

148. The isolated polypeptide of claim 147 comprising a contiguous span of at least 10 amino 
acids. 

149. The isolated polypeptide of claim 147 comprising a contiguous span of at least 12 amino
acids.

150. The isolated polypeptide of claim 147 comprising a contiguous span of at least 15 amino
acids.

151. The isolated polypeptide of claim 147 comprising a contiguous span of at least 17 amino
acids.

152. The isolated polypeptide of claim 147 comprising a contiguous span of at least 20 amino 
acids. 

153. An isolated polypeptide comprising an amino acid sequence of from 4 to 30 amino acids that
is at least 75% identical to a contiguous span of amino acids of the amino acid sequence set 
forth in SEQ ID NO:4 of the same length, wherein said isolated polypeptide is capable of 
interacting with LXR-alpha. 

154. The isolated polypeptide of claim 153, wherein said amino acid sequence comprises from
8 to 20 amino acids. 
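
The limitation in claims 153 and 154 of being "at least 75% identical to a contiguous span ... of the same length" can be illustrated with a simple ungapped comparison. The claims themselves do not state how identity is computed, so the Python sketch below assumes position-by-position matching against every same-length window of the reference; alignment programs may score identity differently. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO:4.

```python
def best_ungapped_identity(peptide: str, protein: str) -> float:
    """Highest fraction of identical positions between `peptide` and any
    contiguous span of `protein` of the same length (no gaps allowed)."""
    n = len(peptide)
    if n == 0 or n > len(protein):
        return 0.0
    best = 0.0
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches / n)
    return best


def within_claimed_range(peptide: str, protein: str,
                         min_len: int = 4, max_len: int = 30,
                         threshold: float = 0.75) -> bool:
    """Check the length window (4 to 30 residues) and the 75% identity floor."""
    if not (min_len <= len(peptide) <= max_len):
        return False
    return best_ungapped_identity(peptide, protein) >= threshold


# Toy example: "ACDE" matches the window "ACDF" at 3 of 4 positions.
print(best_ungapped_identity("ACDE", "MACDFK"))   # 0.75
print(within_claimed_range("ACDE", "MACDFK"))     # True
```

Passing `min_len=8, max_len=20` gives the narrower length range of claim 154. The binding requirement (capable of interacting with LXR-alpha) is an experimental property and is outside what this sketch checks.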

155. An antibody which is specifically immunoreactive with the isolated polypeptide of claim 
145. 

156. An antibody which is specifically immunoreactive with the isolated polypeptide of claim 
147. 

157. A protein microarray comprising the isolated polypeptide of claim 145. 

158. A protein microarray comprising the isolated polypeptide of claim 147. 

159. A protein microarray comprising the isolated polypeptide of claim 154.



160. A method for making an isolated polypeptide comprising an amino acid sequence set forth
in SEQ ID NO:4, comprising:

providing an expression vector comprising a nucleic acid encoding said amino acid
sequence; and

introducing said expression vector into a host cell such that said host cell produces
the isolated polypeptide.

WO 02/055657

PCT/US01/48561

SEQUENCE LISTING

<110> Myriad Genetics, Inc.
      Cimbora, Daniel M.
      Heichman, Karen
      Bartel, Paul L.

<120> Protein-Protein Interactions

<130> 2318-282-11

<150> US 60/256,983

<151> 2000-12-21

<160> 4

<170> PatentIn version 3.0

<210> 1

<211> 40

<212> DNA

<213> Artificial

<220>

<223> oligonucleotide primer

<400> 1

gcaggaaaca gctatgacca tacagtcagc ggccgccacc 40

<210> 2

<211> 39

<212> DNA

<213> Artificial

<220>

<223> oligonucleotide primer

<400> 2

acggccagtc gcgtggagtg ttatgtcatg cggccgcta 39

<210> 3

<211> 10625

<212> DNA

<213> Homo sapiens

<220>

<221> CDS

<222> (544)..(6960)

<400> 3

cttattttga aaacatttac atagtgatta gttaacccaa cagaccaatc ctgggaagac 60 

agccagagcc tgcagcacct tagtaacaga aaaactgata attaggagaa gagacctgtc 120 

caagaccagg aacctggacc aaaattgtgc catgttgctt tactttaatg agtggcccca 180 

gtaaaaactg agctgtatgg cagagctgtt cacatttatc ttctgtgtcc acccagttct 240 

gctgaaaccc ctggcaagat cgtggccctg ttgtagcttg tcatgttttg aacagctgtc 300 

tatggaaaga aagcaaacac aacctagagc aacattgatt tgttttagaa agctctttta 360 
ttttcagttc tggctgtgtt caacatctta gcttacgttt ttcatgttgt aatgatctgc 420 



cgtatggacg atcacctcta agttagagag ttctgtaatt tggcttggat taaagatgct 480

tggttagtga aagctgctgc tttttttata gtcaaaggac tggttctgag agccttgttg 540

cag atg gct gag gtc acc gtc cca agg gtg tat gtc gtg ttt ggc atc 588
Met Ala Glu Val Thr Val Pro Arg Val Tyr Val Val Phe Gly Ile
1 5 10 15

cat tgc atc atg gcg aag gca tct tca gat gtg cag gtt tca ggc ttt 636
His Cys Ile Met Ala Lys Ala Ser Ser Asp Val Gln Val Ser Gly Phe
20 25 30

cat cgg aaa atc cag cac gtt aaa aat gaa ctt tgc cac atg ttg agc 684
His Arg Lys Ile Gln His Val Lys Asn Glu Leu Cys His Met Leu Ser
35 40 45

ttg gag gag gtg gcc cca gtg ctg cag cag aca tta ctt cag gac aac 732
Leu Glu Glu Val Ala Pro Val Leu Gln Gln Thr Leu Leu Gln Asp Asn
50 55 60

ctc ttg ggc agg gta cat ttt gac caa ttt aaa gaa gca tta ata ctc 780
Leu Leu Gly Arg Val His Phe Asp Gln Phe Lys Glu Ala Leu Ile Leu
65 70 75

atc ttg tcc aga act ctg tca aat gaa gaa cac ttt caa gaa cca gac 828
Ile Leu Ser Arg Thr Leu Ser Asn Glu Glu His Phe Gln Glu Pro Asp
80 85 90 95

tgc tca cta gaa gct cag ccc aaa tat gtt aga ggt ggg aag cgt tac 876
Cys Ser Leu Glu Ala Gln Pro Lys Tyr Val Arg Gly Gly Lys Arg Tyr
100 105 110

gga cga agg tcc ttg ccc gag ttc caa gag tcc gtg gag gag ttt cct 924
Gly Arg Arg Ser Leu Pro Glu Phe Gln Glu Ser Val Glu Glu Phe Pro
115 120 125

gaa gtg acg gtg att gag cca ctg gat gaa gaa gcg cgg cct tca cac 972
Glu Val Thr Val Ile Glu Pro Leu Asp Glu Glu Ala Arg Pro Ser His
130 135 140

atc cca gcc ggt gac tgc agt gag cac tgg aag acg caa cgc agt gag 1020
Ile Pro Ala Gly Asp Cys Ser Glu His Trp Lys Thr Gln Arg Ser Glu
145 150 155

gag tat gaa gcg gaa ggc cag tta agg ttt tgg aac cca gat gac ttg 1068
Glu Tyr Glu Ala Glu Gly Gln Leu Arg Phe Trp Asn Pro Asp Asp Leu
160 165 170 175

aat gct tca cag agt gga tct tcc cct ccc caa gac tgg ata gaa gag 1116
Asn Ala Ser Gln Ser Gly Ser Ser Pro Pro Gln Asp Trp Ile Glu Glu
180 185 190

aaa ctg caa gaa gtt tgt gaa gat ttg ggg atc acc cgt gat ggt cac 1164
Lys Leu Gln Glu Val Cys Glu Asp Leu Gly Ile Thr Arg Asp Gly His
195 200 205

ctg aac cgg aag aag ctg gtc tcc atc tgt gag cag tat ggt tta cag 1212
Leu Asn Arg Lys Lys Leu Val Ser Ile Cys Glu Gln Tyr Gly Leu Gln
210 215 220

aat gtg gat gga gag atg ctc gag gaa gta ttc cat aat ctt gat cct 1260
Asn Val Asp Gly Glu Met Leu Glu Glu Val Phe His Asn Leu Asp Pro
225 230 235

gac ggt aca atg agt gta gaa gat ttt ttc tat ggt ttg ttt aaa aat 1308
Asp Gly Thr Met Ser Val Glu Asp Phe Phe Tyr Gly Leu Phe Lys Asn
240 245 250 255

gga aaa tct ctt aca cca tca gca tct act cca tat aga caa cta aaa 1356
Gly Lys Ser Leu Thr Pro Ser Ala Ser Thr Pro Tyr Arg Gln Leu Lys
260 265 270

agg cac ctt tcc atg cag tct ttc gat gag agt gga cga cgt acc aca 1404
Arg His Leu Ser Met Gln Ser Phe Asp Glu Ser Gly Arg Arg Thr Thr
275 280 285

acc tca tca gca atg aca agt acc att ggc ttt cgg gtc ttc tcc tgc 1452
Thr Ser Ser Ala Met Thr Ser Thr Ile Gly Phe Arg Val Phe Ser Cys
290 295 300

ctg gat gat ggg atg ggc cat gca tct gtg gag aga ata ctg gac acc 1500
Leu Asp Asp Gly Met Gly His Ala Ser Val Glu Arg Ile Leu Asp Thr
305 310 315

tgg cag gaa gag ggc att gag aac agc cag gag atc ctg aag gcc ttg 1548
Trp Gln Glu Glu Gly Ile Glu Asn Ser Gln Glu Ile Leu Lys Ala Leu
320 325 330 335

gat ttc agc ctc gat gga aac atc aat ttg aca gaa tta aca ctg gcc 1596
Asp Phe Ser Leu Asp Gly Asn Ile Asn Leu Thr Glu Leu Thr Leu Ala
340 345 350

ctt gaa aat gaa ctt ttg gtt acc aag aac agc att cac cag gcg gct 1644
Leu Glu Asn Glu Leu Leu Val Thr Lys Asn Ser Ile His Gln Ala Ala
355 360 365

ctg gcc agc ttt aag gct gaa atc cgg cat ttg ttg gaa cga gtt gat 1692
Leu Ala Ser Phe Lys Ala Glu Ile Arg His Leu Leu Glu Arg Val Asp
370 375 380

cag gtg gtc aga gaa aaa gag aag cta cgg tca gat ctg gac aag gcc 1740
Gln Val Val Arg Glu Lys Glu Lys Leu Arg Ser Asp Leu Asp Lys Ala
385 390 395

gag aag ctc aag tct tta atg gcc tcg gag gtg gat gat cac cat gcg 1788
Glu Lys Leu Lys Ser Leu Met Ala Ser Glu Val Asp Asp His His Ala
400 405 410 415

gcc ata gag cgg cgg aat gag tac aac ctc agg aaa ctg gat gga gag 1836
Ala Ile Glu Arg Arg Asn Glu Tyr Asn Leu Arg Lys Leu Asp Gly Glu
420 425 430

tac aag gag cga ata gca gcc tta aaa aat gaa ctc cga aaa gag aga 1884
Tyr Lys Glu Arg Ile Ala Ala Leu Lys Asn Glu Leu Arg Lys Glu Arg
435 440 445

gag cag atc ctg cag cag gca ggc aag cag cgt tta gaa ctt gaa cag 1932
Glu Gln Ile Leu Gln Gln Ala Gly Lys Gln Arg Leu Glu Leu Glu Gln
450 455 460

gaa att gaa aag gca aaa aca gaa gag aac tat atc cgg gac cgc ctt 1980
Glu Ile Glu Lys Ala Lys Thr Glu Glu Asn Tyr Ile Arg Asp Arg Leu
465 470 475

gcc ctc tct tta aag gaa aac agt cgt ctg gaa aat gag ctt cta gaa 2028
Ala Leu Ser Leu Lys Glu Asn Ser Arg Leu Glu Asn Glu Leu Leu Glu
480 485 490 495

aat gca gag aag ttg gca gaa tat gag aat ctg aca aac aaa ctt cag 2076
Asn Ala Glu Lys Leu Ala Glu Tyr Glu Asn Leu Thr Asn Lys Leu Gln
500 505 510

aga aat ttg gaa aat gtg tta gca gaa aag ttt ggt gac ctc gat cct 2124
Arg Asn Leu Glu Asn Val Leu Ala Glu Lys Phe Gly Asp Leu Asp Pro
515 520 525

agc agt gct gag ttc ttc ctg caa gaa gag aga ctg aca cag atg aga 2172
Ser Ser Ala Glu Phe Phe Leu Gln Glu Glu Arg Leu Thr Gln Met Arg
530 535 540

aat gaa tat gag cgg cag tgc agg gta cta caa gac caa gta gat gaa 2220
Asn Glu Tyr Glu Arg Gln Cys Arg Val Leu Gln Asp Gln Val Asp Glu
545 550 555

ctc cag tct gag ctg gaa gaa tat cgt gca caa ggc aga gtg ctc agg 2268
Leu Gln Ser Glu Leu Glu Glu Tyr Arg Ala Gln Gly Arg Val Leu Arg
560 565 570 575

ctt ccg ttg aag aac tca ccg tca gaa gaa gtt gag gct aac agc ggt 2316
Leu Pro Leu Lys Asn Ser Pro Ser Glu Glu Val Glu Ala Asn Ser Gly
580 585 590

ggc att gag ccc gaa cac ggg ctc ggt tct gaa gaa tgc aat cca ttg 2364
Gly Ile Glu Pro Glu His Gly Leu Gly Ser Glu Glu Cys Asn Pro Leu
595 600 605

aat atg agc att gag gca gag ctg gtc att gaa cag atg aaa gaa caa 2412
Asn Met Ser Ile Glu Ala Glu Leu Val Ile Glu Gln Met Lys Glu Gln
610 615 620

cat cac agg gac ata tgt tgc ctc aga ctg gag ctc gaa gat aaa gtg 2460
His His Arg Asp Ile Cys Cys Leu Arg Leu Glu Leu Glu Asp Lys Val
625 630 635

cgc cat tat gaa aag cag ctg gac gaa acc gtg gtc agc tgc aag aag 2508
Arg His Tyr Glu Lys Gln Leu Asp Glu Thr Val Val Ser Cys Lys Lys
640 645 650 655

gca cag gag aac atg aag caa agg cat gag aac gaa acg cgc acc tta 2556
Ala Gln Glu Asn Met Lys Gln Arg His Glu Asn Glu Thr Arg Thr Leu
660 665 670

gaa aaa caa ata agt gac ctt aaa aat gaa att gct gaa ctt cag ggg 2604
Glu Lys Gln Ile Ser Asp Leu Lys Asn Glu Ile Ala Glu Leu Gln Gly
675 680 685

caa gca gca gtg ctc aag gag gca cat cat gag gcc act tgc agg cat 2652
Gln Ala Ala Val Leu Lys Glu Ala His His Glu Ala Thr Cys Arg His
690 695 700

gag gag gag aaa aaa caa ctg caa gtg aag ctt gag gag gaa aag act 2700
Glu Glu Glu Lys Lys Gln Leu Gln Val Lys Leu Glu Glu Glu Lys Thr
705 710 715

cac ctg cag gag aag ctg agg ctg caa cat gag atg gag ctc aag gct 2748
His Leu Gln Glu Lys Leu Arg Leu Gln His Glu Met Glu Leu Lys Ala
720 725 730 735

aga ctg aca cag gct caa gca agc ttt gag cgg gag agg gaa ggc ctt 2796
Arg Leu Thr Gln Ala Gln Ala Ser Phe Glu Arg Glu Arg Glu Gly Leu
740 745 750

cag agt agc gcc tgg aca gaa gag aag gtg aga ggc ttg act cag gaa 2844
Gln Ser Ser Ala Trp Thr Glu Glu Lys Val Arg Gly Leu Thr Gln Glu
755 760 765

cta gag cag ttt cac cag gag cag ctg aca agc ctg gtg gag aaa cac 2892
Leu Glu Gln Phe His Gln Glu Gln Leu Thr Ser Leu Val Glu Lys His
770 775 780

act ctt gag aaa gag gag tta aga aaa gag ctc ttg gaa aag cac caa 2940
Thr Leu Glu Lys Glu Glu Leu Arg Lys Glu Leu Leu Glu Lys His Gln
785 790 795

agg gag ctt cag gag gga agg gaa aaa atg gaa aca gag tgt aat aga 2988
Arg Glu Leu Gln Glu Gly Arg Glu Lys Met Glu Thr Glu Cys Asn Arg
800 805 810 815

aga acc tct caa ata gaa gcc cag ttt cag tct gat tgt cag aaa gtc 3036
Arg Thr Ser Gln Ile Glu Ala Gln Phe Gln Ser Asp Cys Gln Lys Val
820 825 830

act gag agg tgt gaa agc gct ctg caa agc ctg gag ggg cgc tac cgc 3084
Thr Glu Arg Cys Glu Ser Ala Leu Gln Ser Leu Glu Gly Arg Tyr Arg
835 840 845

caa gag ctg aag gac ctc cag gaa cag cag cgt gag gag aaa tcc cag 3132
Gln Glu Leu Lys Asp Leu Gln Glu Gln Gln Arg Glu Glu Lys Ser Gln
850 855 860

tgg gaa ttt gag aag gac gag ctc acc cag gag tgt gcg gaa gcc cag 3180
Trp Glu Phe Glu Lys Asp Glu Leu Thr Gln Glu Cys Ala Glu Ala Gln
865 870 875

gag ctg ctg aaa gag act ctt aag aga gag aaa aca act tct ctg gtc 3228
Glu Leu Leu Lys Glu Thr Leu Lys Arg Glu Lys Thr Thr Ser Leu Val
880 885 890 895

ctg acc cag gag aga gag atg ctg gag aaa aca tac aaa gaa cat ttg 3276
Leu Thr Gln Glu Arg Glu Met Leu Glu Lys Thr Tyr Lys Glu His Leu
900 905 910

aac agc atg gtc gtc gag aga cag cag cta ctc caa gac ctg gaa gac 3324
Asn Ser Met Val Val Glu Arg Gln Gln Leu Leu Gln Asp Leu Glu Asp
915 920 925

cta aga aat gta tct gaa acc cag caa agc ctg ctg tct gac cag ata 3372
Leu Arg Asn Val Ser Glu Thr Gln Gln Ser Leu Leu Ser Asp Gln Ile
930 935 940

ctt gag ctg aag agc agt cac aaa agg gaa ctg agg gag cgt gag gag 3420
Leu Glu Leu Lys Ser Ser His Lys Arg Glu Leu Arg Glu Arg Glu Glu
945 950 955

gtc ctg tgc cag gca ggg gct tcg gag cag ctg gcc agc cag cgg ctg 3468
Val Leu Cys Gln Ala Gly Ala Ser Glu Gln Leu Ala Ser Gln Arg Leu
960 965 970 975

gaa aga cta gaa atg gaa cat gac cag gaa agg cag gaa atg atg tcc 3516
Glu Arg Leu Glu Met Glu His Asp Gln Glu Arg Gln Glu Met Met Ser
980 985 990

aag ctt cta gcc atg gag aac att cac aaa gcg acc tgt gag aca gca 3564
Lys Leu Leu Ala Met Glu Asn Ile His Lys Ala Thr Cys Glu Thr Ala
995 1000 1005

gat cga gaa aga gcc gag atg agc aca gaa atc tcc aga ctt cag 3609
Asp Arg Glu Arg Ala Glu Met Ser Thr Glu Ile Ser Arg Leu Gln
1010 1015 1020

agt aaa ata aag gaa atg cag cag gca aca tct cct ctc tca atg 3654
Ser Lys Ile Lys Glu Met Gln Gln Ala Thr Ser Pro Leu Ser Met
1025 1030 1035

ctt cag agt ggt tgc cag gtg ata gga gag gag gag gtg gaa gga 3699
Leu Gln Ser Gly Cys Gln Val Ile Gly Glu Glu Glu Val Glu Gly
1040 1045 1050

gat gga gcc ctg tcc ctg ctt cag caa ggg gag cag ctg ttg gaa 3744
Asp Gly Ala Leu Ser Leu Leu Gln Gln Gly Glu Gln Leu Leu Glu
1055 1060 1065

gaa aat ggg gac gtc ctc tta agc ctg cag aga gct cat gaa cag 3789
Glu Asn Gly Asp Val Leu Leu Ser Leu Gln Arg Ala His Glu Gln
1070 1075 1080

gca gtg aag gaa aat gtg aaa atg gct act gaa att tct aga ttg 3834
Ala Val Lys Glu Asn Val Lys Met Ala Thr Glu Ile Ser Arg Leu
1085 1090 1095

caa cag agg cta caa aag tta gag cca ggg tta gta atg tct tct 3879
Gln Gln Arg Leu Gln Lys Leu Glu Pro Gly Leu Val Met Ser Ser
1100 1105 1110

tgt ttg gat gag cca gct act gag ttt ttt gga aat act gcg gaa 3924
Cys Leu Asp Glu Pro Ala Thr Glu Phe Phe Gly Asn Thr Ala Glu
1115 1120 1125

caa aca gag cag ttt tta cag caa aac cga acg aag caa gta gaa 3969
Gln Thr Glu Gln Phe Leu Gln Gln Asn Arg Thr Lys Gln Val Glu
1130 1135 1140

ggt gtg acc agg cgg cat gtc cta agt gac ctg gaa gat gat gag 4014
Gly Val Thr Arg Arg His Val Leu Ser Asp Leu Glu Asp Asp Glu
1145 1150 1155



gtc cgg gac ctg gga agt aca ggg acg agc tct gtt cag aga cag 4059
Val Arg Asp Leu Gly Ser Thr Gly Thr Ser Ser Val Gln Arg Gln
1160 1165 1170

gaa gtc aaa ata gag gag tct gaa gct tca gta gag ggt ttt tct 4104
Glu Val Lys Ile Glu Glu Ser Glu Ala Ser Val Glu Gly Phe Ser
1175 1180 1185

gag ctt gaa aac agt gaa gag acc agg act gaa tcc tgg gag ctg 4149
Glu Leu Glu Asn Ser Glu Glu Thr Arg Thr Glu Ser Trp Glu Leu
1190 1195 1200

aag aat cag att agt cag ctt cag gaa cag cta atg atg tta tgt 4194
Lys Asn Gln Ile Ser Gln Leu Gln Glu Gln Leu Met Met Leu Cys
1205 1210 1215

gcg gac tgt gat cga gct tct gaa aag aaa cag gac cta ctt ttt 4239
Ala Asp Cys Asp Arg Ala Ser Glu Lys Lys Gln Asp Leu Leu Phe
1220 1225 1230

gat gtt tct gtg cta aaa aag aaa ctg aag atg ctt gag aga atc 4284
Asp Val Ser Val Leu Lys Lys Lys Leu Lys Met Leu Glu Arg Ile
1235 1240 1245

cct gag gct tct ccc aaa tat aag ctg ttg tat gaa gat gtg agc 4329
Pro Glu Ala Ser Pro Lys Tyr Lys Leu Leu Tyr Glu Asp Val Ser
1250 1255 1260

cga gaa aat gac tgc ctt cag gaa gag ctg aga atg atg gag aca 4374
Arg Glu Asn Asp Cys Leu Gln Glu Glu Leu Arg Met Met Glu Thr
1265 1270 1275

cgc tac gat gag gca cta gaa aat aac aaa gaa ctc act gca gag 4419
Arg Tyr Asp Glu Ala Leu Glu Asn Asn Lys Glu Leu Thr Ala Glu
1280 1285 1290

gtt ttc agg ttg cag gat gag ctg aag aaa atg gag gaa gtc act 4464
Val Phe Arg Leu Gln Asp Glu Leu Lys Lys Met Glu Glu Val Thr
1295 1300 1305

gaa aca ttc ctc agc ctg gaa aag agt tac gat gag gtc aaa ata 4509
Glu Thr Phe Leu Ser Leu Glu Lys Ser Tyr Asp Glu Val Lys Ile
1310 1315 1320

gaa aat gag ggg ctg aat gtt ctg gtt ttg aga ctt caa ggc aag 4554
Glu Asn Glu Gly Leu Asn Val Leu Val Leu Arg Leu Gln Gly Lys
1325 1330 1335

att gag aag ctt cag gaa agc gtg gtc cag cgg tgt gac tgc tgc 4599
Ile Glu Lys Leu Gln Glu Ser Val Val Gln Arg Cys Asp Cys Cys
1340 1345 1350

tta tgg gaa gcc agt tta gag aac ctg gaa atc gaa cct gat gga 4644
Leu Trp Glu Ala Ser Leu Glu Asn Leu Glu Ile Glu Pro Asp Gly
1355 1360 1365

aat ata ctc cag ctc aat cag aca ctg gaa gag tgt gtg ccc agg 4689
Asn Ile Leu Gln Leu Asn Gln Thr Leu Glu Glu Cys Val Pro Arg
1370 1375 1380

gtt agg agt gta cat cat gtc ata gag gaa tgt aag caa gaa aac 4734
Val Arg Ser Val His His Val Ile Glu Glu Cys Lys Gln Glu Asn
1385 1390 1395

cag tac ctt gag ggg aac aca cag ctc ttg gaa aaa gta aaa gca 4779
Gln Tyr Leu Glu Gly Asn Thr Gln Leu Leu Glu Lys Val Lys Ala
1400 1405 1410

cat gaa att gcc tgg tta cat gga aca att cag aca cat caa gaa 4824
His Glu Ile Ala Trp Leu His Gly Thr Ile Gln Thr His Gln Glu
1415 1420 1425

agg cca aga gta cag aat caa gtt ata ctg gag gaa aac act act 4869
Arg Pro Arg Val Gln Asn Gln Val Ile Leu Glu Glu Asn Thr Thr
1430 1435 1440

ctc cta ggc ttt caa gac aaa cat ttt cag cat cag gcc acc ata 4914
Leu Leu Gly Phe Gln Asp Lys His Phe Gln His Gln Ala Thr Ile
1445 1450 1455

gca gag tta gaa ctg gag aaa aca aag tta cag gag ctg act agg 4959
Ala Glu Leu Glu Leu Glu Lys Thr Lys Leu Gln Glu Leu Thr Arg
1460 1465 1470

aag ttg aag gag aga gtc act att tta gtt aag caa aaa gat gta 5004
Lys Leu Lys Glu Arg Val Thr Ile Leu Val Lys Gln Lys Asp Val
1475 1480 1485

ctt tct cac gga gaa aag gag gaa gag ctg aag gca atg atg cat 5049
Leu Ser His Gly Glu Lys Glu Glu Glu Leu Lys Ala Met Met His
1490 1495 1500

gac ttg cag atc acg tgc agt gag atg cag caa aaa gtt gaa ctt 5094
Asp Leu Gln Ile Thr Cys Ser Glu Met Gln Gln Lys Val Glu Leu
1505 1510 1515

ctg aga tat gaa tct gaa aag ctt caa cag gaa aat tct att ttg 5139
Leu Arg Tyr Glu Ser Glu Lys Leu Gln Gln Glu Asn Ser Ile Leu
1520 1525 1530

aga aat gaa att act act tta aat gaa gaa gat agc att tct aac 5184
Arg Asn Glu Ile Thr Thr Leu Asn Glu Glu Asp Ser Ile Ser Asn
1535 1540 1545

ctg aaa tta ggg aca tta aat gga tct cag gaa gaa atg tgg caa 5229
Leu Lys Leu Gly Thr Leu Asn Gly Ser Gln Glu Glu Met Trp Gln
1550 1555 1560

aaa acg gaa act gta aaa caa gaa aat gct gca gtt cag aag atg 5274
Lys Thr Glu Thr Val Lys Gln Glu Asn Ala Ala Val Gln Lys Met
1565 1570 1575

gtt gaa aat tta aag aaa cag att tca gaa tta aaa atc aaa aac 5319
Val Glu Asn Leu Lys Lys Gln Ile Ser Glu Leu Lys Ile Lys Asn
1580 1585 1590



caa caa ttg gat ttg gaa aat aca gaa ctt agc caa aag aac tct 5364
Gln Gln Leu Asp Leu Glu Asn Thr Glu Leu Ser Gln Lys Asn Ser
1595 1600 1605



caa aac cag gaa aaa ctg caa gaa ctt aat caa cgt cta aca gaa 5409
Gln Asn Gln Glu Lys Leu Gln Glu Leu Asn Gln Arg Leu Thr Glu
1610 1615 1620

atg cta tgc cag aag gaa aaa gag cca gga aac agt gca ttg gag 5454
Met Leu Cys Gln Lys Glu Lys Glu Pro Gly Asn Ser Ala Leu Glu
1625 1630 1635

gaa cgg gaa caa gag aag ttt aat ctg aaa gaa gaa ctg gaa cgt 5499
Glu Arg Glu Gln Glu Lys Phe Asn Leu Lys Glu Glu Leu Glu Arg
1640 1645 1650

tgt aaa gtg cag tcc tcc act tta gtg tct tct ctg gag gcg gag 5544
Cys Lys Val Gln Ser Ser Thr Leu Val Ser Ser Leu Glu Ala Glu
1655 1660 1665

ctc tct gaa gtt aaa ata cag acc cat att gtg caa cag gaa aac 5589
Leu Ser Glu Val Lys Ile Gln Thr His Ile Val Gln Gln Glu Asn
1670 1675 1680

cac ctt ctc aaa gat gaa ctg gag aaa atg aaa cag ctg cac aga 5634
His Leu Leu Lys Asp Glu Leu Glu Lys Met Lys Gln Leu His Arg
1685 1690 1695

tgt ccc gat ctc tct gac ttc cag caa aaa atc tct agt gtt cta 5679
Cys Pro Asp Leu Ser Asp Phe Gln Gln Lys Ile Ser Ser Val Leu
1700 1705 1710

agc tac aac gaa aaa ctg ctg aaa gaa aag gaa gct ctg agt gag 5724
Ser Tyr Asn Glu Lys Leu Leu Lys Glu Lys Glu Ala Leu Ser Glu
1715 1720 1725

gaa tta aat agc tgt gtc gat aag ttg gca aaa tca agt ctt tta 5769
Glu Leu Asn Ser Cys Val Asp Lys Leu Ala Lys Ser Ser Leu Leu
1730 1735 1740

gag cat aga att gcg acg atg aag cag gaa cag aaa tcc tgg gaa 5814
Glu His Arg Ile Ala Thr Met Lys Gln Glu Gln Lys Ser Trp Glu
1745 1750 1755



cat cag agt gcg agc tta aag tca cag ctg gtg gct tct cag gaa 5859
His Gln Ser Ala Ser Leu Lys Ser Gln Leu Val Ala Ser Gln Glu
1760 1765 1770

aag gtt cag aat tta gaa gac acc gtg cag aat gta aac ctg caa 5904
Lys Val Gln Asn Leu Glu Asp Thr Val Gln Asn Val Asn Leu Gln
1775 1780 1785



atg tcc cgg atg aaa tct gac cta cga gtg act cag cag gaa aag 5949
Met Ser Arg Met Lys Ser Asp Leu Arg Val Thr Gln Gln Glu Lys
1790 1795 1800

gag gct tta aaa caa gaa gtg atg tct tta cat aag caa ctt cag 5994
Glu Ala Leu Lys Gln Glu Val Met Ser Leu His Lys Gln Leu Gln
1805 1810 1815

aat gct ggt ggc aag agc tgg gcc cca gag ata gct act cat cca 6039
Asn Ala Gly Gly Lys Ser Trp Ala Pro Glu Ile Ala Thr His Pro
1820 1825 1830

tca ggg ctc cat aac cag cag aaa agg ctg tcc tgg gac aag ttg 6084
Ser Gly Leu His Asn Gln Gln Lys Arg Leu Ser Trp Asp Lys Leu
1835 1840 1845

gat cat ctg atg aat gag gaa cag cag ctg ctt tgg caa gag aat 6129
Asp His Leu Met Asn Glu Glu Gln Gln Leu Leu Trp Gln Glu Asn
1850 1855 1860

gag agg ctc cag acc atg gta cag aac acc aaa gcc gaa ctc acg 6174
Glu Arg Leu Gln Thr Met Val Gln Asn Thr Lys Ala Glu Leu Thr
1865 1870 1875

cac tcc cgg gag aag gtc cgt caa ttg gaa tcc aat ctt ctt ccc 6219
His Ser Arg Glu Lys Val Arg Gln Leu Glu Ser Asn Leu Leu Pro
1880 1885 1890

aag cac caa aaa cat cta aac cca tca ggt acc atg aat ccc aca 6264
Lys His Gln Lys His Leu Asn Pro Ser Gly Thr Met Asn Pro Thr
1895 1900 1905

gag caa gaa aaa ttg agc tta aag aga gag tgt gat cag ttt cag 6309
Glu Gln Glu Lys Leu Ser Leu Lys Arg Glu Cys Asp Gln Phe Gln
1910 1915 1920

aaa gaa caa tct cct gct aac agg aag gtc agt cag atg aat tcc 6354
Lys Glu Gln Ser Pro Ala Asn Arg Lys Val Ser Gln Met Asn Ser
1925 1930 1935

ctt gaa caa gaa tta gaa aca att cat ttg gaa aat gaa ggc ctg 6399
Leu Glu Gln Glu Leu Glu Thr Ile His Leu Glu Asn Glu Gly Leu
1940 1945 1950

aaa aag aaa caa gta aaa ctg gat gag cag ctc atg gag atg cag 6444
Lys Lys Lys Gln Val Lys Leu Asp Glu Gln Leu Met Glu Met Gln
1955 1960 1965

cac ctg agg tcc act gcg acg cct agc ccg tcc cct cat gct tgg 6489
His Leu Arg Ser Thr Ala Thr Pro Ser Pro Ser Pro His Ala Trp
1970 1975 1980

gat ttg cag ctg ctc cag cag caa gcc tgt ccg atg gtg ccc agg 6534
Asp Leu Gln Leu Leu Gln Gln Gln Ala Cys Pro Met Val Pro Arg
1985 1990 1995

gag cag ttt ctg cag ctt caa cgc cag ctg ctg cag gca gaa agg 6579
Glu Gln Phe Leu Gln Leu Gln Arg Gln Leu Leu Gln Ala Glu Arg
2000 2005 2010

ata aac cag cac ctg cag gag gaa ctt gaa aac agg acc tcc gaa 6624
Ile Asn Gln His Leu Gln Glu Glu Leu Glu Asn Arg Thr Ser Glu
2015 2020 2025

acc aac aca cca cag gga aac cag gaa caa ctg gta act gtc atg 6669
Thr Asn Thr Pro Gln Gly Asn Gln Glu Gln Leu Val Thr Val Met
2030 2035 2040

gag gaa cga atg ata gaa gtt gaa cag aaa ctg aaa cta gtg aaa 6714
Glu Glu Arg Met Ile Glu Val Glu Gln Lys Leu Lys Leu Val Lys
2045 2050 2055

agg ctt ctt caa gag aaa gtg aat cag ctc aaa gaa caa ctc tgc 6759
Arg Leu Leu Gln Glu Lys Val Asn Gln Leu Lys Glu Gln Leu Cys
2060 2065 2070

aag aac act aag gca gac gca atg gtg aag gac ttg tat gtt gaa 6804
Lys Asn Thr Lys Ala Asp Ala Met Val Lys Asp Leu Tyr Val Glu
2075 2080 2085

aat gcc cag ttg ttg aaa gct ctg gaa gtg act gaa cag cga cag 6849
Asn Ala Gln Leu Leu Lys Ala Leu Glu Val Thr Glu Gln Arg Gln
2090 2095 2100

aaa aca gca gag aag aaa aat tac ctc ctg gag gag aag att gcc 6894
Lys Thr Ala Glu Lys Lys Asn Tyr Leu Leu Glu Glu Lys Ile Ala
2105 2110 2115

agc ctc agt aat ata gtt agg aat ctg aca cca gcg cca ttg act 6939
Ser Leu Ser Asn Ile Val Arg Asn Leu Thr Pro Ala Pro Leu Thr
2120 2125 2130

tct aca cct cct ttg agg tca tagccaaacc aaagggtaca ctcatatttg 6990
Ser Thr Pro Pro Leu Arg Ser
2135

tgcactttac tgaaatagat gaacatttca gtaggttctc aacttaaaat taagcctaac 7050

ctaaaactgc cagcaacaca actggagttt ccatttatca taattagttt ttctaaatag 7110

acccttatgg gagtttgaaa ataaatactc acatatttca ctacttaaat tattcccaag 7170 

atttgaattt attttaaaat tttaatagcc accaagaatg tggacatatg aaaattcaag 7230

aacctaaaaa ataccagttt tgaatgagtt tttgtggttt tggtttttta attattacaa 7290 

atctatgtgt aaaatctaga tatttgaagt ttgagatctg atgagaatgg ttgttataaa 7350 

ctttatttta aaaccaaatt taggtgttct tacatattta aatactggaa agtcattata 7410

atagttttgg ttctttgaat tggtagacaa ttagtagagt ataattggtt aggaggcagg 7470 

gcttattaag tggttattaa ccgctgacat cagacaaacc caaatctgta gaattctaac 7530

ctcctaacac ctgtgacagt attaccactc ttcttgtatt atagatttag aactgattta 7590 
ctcaattgca ctcttaacta atgttaaaag cttacttgct ttaaacagcc ttttcttctt 7650

tctcttaaaa gtttcatttg gggagctggt cttctaagaa acggataaag ccacataatt 7710 

aaagcagttg aactagaggg aaagcactga acaaaccact ttggagtaaa tagctactct 7770 

tagaaaagag ggataagcag accatgtagg ttttctgtct ctcaaatctt agagttcata 7830 

aatttacttg aggttgcctc aagaactcag ggaacaatac tgtaaactgt cttcctgaac 7890 

tactgtaggg cctctctaag aatttgaaat gtataaacca tgtgacctca tttatttgtc 7950

ttatatattt acagccatac tagaattttt atttctacgt ttttagtaaa tttaatattc 8010 

tgggggaaaa aaggccttga ttttagggtt aaaaacctga cttatagaag agtttattta 8070 

atataggtca aaattttctg tgtttcttat tccttctata cctcaaatct gattctaaga 8130 

atttcttact gtgataatca ttggcatgcc acctgaggtc aaggagtgcc aaataggact 8190 

ttccactcat gctcaagatc aaaactttat agaacagtca acattttaga ttcggtaacc 8250 

ttttttttct tccaattata atctctgctt ctagccactt ccgccagcag ttggtggaag 8310 

acttactagg tgcagggcac tttccaagtt catcacaaca acctgcttgt tttcatgaga 8370 

caataatccg aaaagttcgc tttgatatat tcctggaggg ccaagcccat ctatttacaa 8430 

aaggtgaaca gcaaaatcaa gcactgcttt atgggcagga acacaagaga aagcaaactg 8490

cccaagaagt catcatgtca gaaactcaat ctcaacaaaa taatttccat cagggaactt 8550 

cagggtttct tgggggctta tgagtctcac cggtcaaccc aggaggcctc actacaagag 8610

ccttgacaag gcactgtttt ttgtgggact gggagttcac actgatgaag caaacctttg 8670 

aatttttgca cagctcttgt cagaaagccc tgagttcccc ctggataaag agttaatttt 8730

aatccttccc tataattata cttcaaaata tttgacatct gctattatgc cttctttaga 8790 

tctttcttct gcggtgcaga catttctagt aagtgtttga ctacttgtat ggcattagct 8850 

ttcacagaaa attgtttcac ttaaaactgt ggattggcct aggctaagga caaaaataaa 8910

ctaagtacct gtagtgtatt tatgtgatat gtgtcaagtt actcaaagtt attgctgttg 8970 

gaactgaaca ataatatttc ccagatagct ggccttagca tgtgatcacg gttgttgtat 9030 

ttttaatttt tgtcttttac agtatgagag gtgtaggtta atttgtttat ttcctataaa 9090 

tttgtattta tgtgtatata aaatgtacaa tgaatgtaaa tatgactttc tggaaagttt 9150 

agactacatt tagaatctct attcaaaatc aaaatgctgc tcaaatgaat ttaaccaaca 9210 

tctaggtgct taatttctca ttttatccca cttatgagat tgggaaaaag atcaatatga 9270 

gaaataccat acagatacct taaatgtatg catttgtgca acaatttttg agaaggtgag 9330 

tggcaattta taatttagtt ggcaatttat aatagaactt atagctttta aaagactttt 9390 

taaagacatt aaatgtaaac ttaaaaatgt ttagatcttg tttcaaactt tacaatagca 9450 

ttcttcaaaa tattaagtta tatattttat aggcatttag ttgcttatta aaagcactga 9510 

ttttcaaact ttttgattta agaacaatta tttaagatcg tctcagaaga tgggatcttc 9570 
gtttcaagaa aagggaatca agtttgcctt tgagataata cgttacacta agaaaaggaa 9630

aatgtggata gtaaaaccca cctctctcat cctattgtac tctcttctgc tttttagaag 9690 

cctgcactta agcttagatt tgtgaaggga gagtagaagg ggagaagtag aaccacagtg 9750 

ttttatttat ttttctaaaa ctcttactaa atccagattt tttaaactgt tttaaatgtg 9810 

aattcttccc agaaatttca atgcattgca tatttagcct tcggcatatt tttcatgaat 9870 

agatcatgaa gtcataggct tccaaggcat aggaagagat cttgcaggtc tagtatttta 9930 

ataatgcact attacccagg gcagatatta tgagaaactg tttcttctct aagggtttat 9990 

ggcagacttt gcttttttaa catgtgagaa atgaattttt tattttgtga tttatgtgat 10050 

ttcttttgct gagtgaagga aaggagaaat tgttgctatt gtcagcatct taaaggtatt 10110 

tccagtcaag gcaaggctaa gtgctttgtg atagtattaa gcaagtcatg ttttgaatgg 10170 

attacctgta gtgactcatt ggaatgatat aattatacaa gtaatgccaa aaaccaagtc 10230 

aaagcctaat taaccaaagc actcatttaa aaatcatcat gtttggacct atctggacct 10290 

ctcagcactg taaaatagtt ttggttttgt ggcatatgaa tagctgttta acaaatcaaa 10350 

gttagctttt tgcttctcag cttttttggg caatacaagt taagttctta atggggagac 10410 
attatcatgg catgacttaa gggaacattg gtttgtgaag gaaaaacaga ttatctaaag 10470 

ccatctctat gtttctgttc agataaagat taatgagttc tgtgtttata tcagctttgt 10530 

atatttcatc ttagccattc tatcctagaa agattttaat gtgagcttaa gatgtaaata 10590 

aataattttg caaacatgaa aaaaaaaaaa aaaaa 10625 

<210> 4 

<211> 2139 

<212> PRT 

<213> Homo sapiens 

<400> 4 

Met Ala Glu Val Thr Val Pro Arg Val Tyr Val Val Phe Gly Ile His
1 5 10 15

Cys Ile Met Ala Lys Ala Ser Ser Asp Val Gln Val Ser Gly Phe His
20 25 30

Arg Lys Ile Gln His Val Lys Asn Glu Leu Cys His Met Leu Ser Leu
35 40 45

Glu Glu Val Ala Pro Val Leu Gln Gln Thr Leu Leu Gln Asp Asn Leu
50 55 60

Leu Gly Arg Val His Phe Asp Gln Phe Lys Glu Ala Leu Ile Leu Ile
65 70 75 80

Leu Ser Arg Thr Leu Ser Asn Glu Glu His Phe Gln Glu Pro Asp Cys
85 90 95

Ser Leu Glu Ala Gln Pro Lys Tyr Val Arg Gly Gly Lys Arg Tyr Gly
100 105 110

Arg Arg Ser Leu Pro Glu Phe Gln Glu Ser Val Glu Glu Phe Pro Glu
115 120 125



Val Thr Val Ile Glu Pro Leu Asp Glu Glu Ala Arg Pro Ser His Ile
130 135 140

Pro Ala Gly Asp Cys Ser Glu His Trp Lys Thr Gln Arg Ser Glu Glu
145 150 155 160

Tyr Glu Ala Glu Gly Gln Leu Arg Phe Trp Asn Pro Asp Asp Leu Asn
165 170 175

Ala Ser Gln Ser Gly Ser Ser Pro Pro Gln Asp Trp Ile Glu Glu Lys
180 185 190

Leu Gln Glu Val Cys Glu Asp Leu Gly Ile Thr Arg Asp Gly His Leu
195 200 205

Asn Arg Lys Lys Leu Val Ser Ile Cys Glu Gln Tyr Gly Leu Gln Asn
210 215 220

Val Asp Gly Glu Met Leu Glu Glu Val Phe His Asn Leu Asp Pro Asp
225 230 235 240

Gly Thr Met Ser Val Glu Asp Phe Phe Tyr Gly Leu Phe Lys Asn Gly
245 250 255

Lys Ser Leu Thr Pro Ser Ala Ser Thr Pro Tyr Arg Gln Leu Lys Arg
260 265 270

His Leu Ser Met Gln Ser Phe Asp Glu Ser Gly Arg Arg Thr Thr Thr
275 280 285

Ser Ser Ala Met Thr Ser Thr Ile Gly Phe Arg Val Phe Ser Cys Leu
290 295 300

Asp Asp Gly Met Gly His Ala Ser Val Glu Arg Ile Leu Asp Thr Trp
305 310 315 320

Gln Glu Glu Gly Ile Glu Asn Ser Gln Glu Ile Leu Lys Ala Leu Asp
325 330 335

Phe Ser Leu Asp Gly Asn Ile Asn Leu Thr Glu Leu Thr Leu Ala Leu
340 345 350

Glu Asn Glu Leu Leu Val Thr Lys Asn Ser Ile His Gln Ala Ala Leu
355 360 365

Ala Ser Phe Lys Ala Glu Ile Arg His Leu Leu Glu Arg Val Asp Gln
370 375 380

Val Val Arg Glu Lys Glu Lys Leu Arg Ser Asp Leu Asp Lys Ala Glu
385 390 395 400

Lys Leu Lys Ser Leu Met Ala Ser Glu Val Asp Asp His His Ala Ala
405 410 415

Ile Glu Arg Arg Asn Glu Tyr Asn Leu Arg Lys Leu Asp Gly Glu Tyr
420 425 430

Lys Glu Arg Ile Ala Ala Leu Lys Asn Glu Leu Arg Lys Glu Arg Glu
435 440 445

Gln Ile Leu Gln Gln Ala Gly Lys Gln Arg Leu Glu Leu Glu Gln Glu
450 455 460

Ile Glu Lys Ala Lys Thr Glu Glu Asn Tyr Ile Arg Asp Arg Leu Ala
465 470 475 480

Leu Ser Leu Lys Glu Asn Ser Arg Leu Glu Asn Glu Leu Leu Glu Asn
485 490 495

Ala Glu Lys Leu Ala Glu Tyr Glu Asn Leu Thr Asn Lys Leu Gln Arg
500 505 510

Asn Leu Glu Asn Val Leu Ala Glu Lys Phe Gly Asp Leu Asp Pro Ser
515 520 525

Ser Ala Glu Phe Phe Leu Gln Glu Glu Arg Leu Thr Gln Met Arg Asn
530 535 540

Glu Tyr Glu Arg Gln Cys Arg Val Leu Gln Asp Gln Val Asp Glu Leu
545 550 555 560

Gln Ser Glu Leu Glu Glu Tyr Arg Ala Gln Gly Arg Val Leu Arg Leu
565 570 575

Pro Leu Lys Asn Ser Pro Ser Glu Glu Val Glu Ala Asn Ser Gly Gly
580 585 590

Ile Glu Pro Glu His Gly Leu Gly Ser Glu Glu Cys Asn Pro Leu Asn
595 600 605

Met Ser Ile Glu Ala Glu Leu Val Ile Glu Gln Met Lys Glu Gln His
610 615 620

His Arg Asp Ile Cys Cys Leu Arg Leu Glu Leu Glu Asp Lys Val Arg
625 630 635 640

His Tyr Glu Lys Gln Leu Asp Glu Thr Val Val Ser Cys Lys Lys Ala
645 650 655

Gln Glu Asn Met Lys Gln Arg His Glu Asn Glu Thr Arg Thr Leu Glu
660 665 670

Lys Gln Ile Ser Asp Leu Lys Asn Glu Ile Ala Glu Leu Gln Gly Gln
675 680 685

Ala Ala Val Leu Lys Glu Ala His His Glu Ala Thr Cys Arg His Glu
690 695 700

Glu Glu Lys Lys Gln Leu Gln Val Lys Leu Glu Glu Glu Lys Thr His
705 710 715 720

Leu Gln Glu Lys Leu Arg Leu Gln His Glu Met Glu Leu Lys Ala Arg
725 730 735

Leu Thr Gln Ala Gln Ala Ser Phe Glu Arg Glu Arg Glu Gly Leu Gln
740 745 750

Ser Ser Ala Trp Thr Glu Glu Lys Val Arg Gly Leu Thr Gln Glu Leu
755 760 765

Glu Gln Phe His Gln Glu Gln Leu Thr Ser Leu Val Glu Lys His Thr
770 775 780

Leu Glu Lys Glu Glu Leu Arg Lys Glu Leu Leu Glu Lys His Gln Arg
785 790 795 800

Glu Leu Gln Glu Gly Arg Glu Lys Met Glu Thr Glu Cys Asn Arg Arg
805 810 815

Thr Ser Gln Ile Glu Ala Gln Phe Gln Ser Asp Cys Gln Lys Val Thr
820 825 830

Glu Arg Cys Glu Ser Ala Leu Gln Ser Leu Glu Gly Arg Tyr Arg Gln
835 840 845

Glu Leu Lys Asp Leu Gln Glu Gln Gln Arg Glu Glu Lys Ser Gln Trp
850 855 860

Glu Phe Glu Lys Asp Glu Leu Thr Gln Glu Cys Ala Glu Ala Gln Glu
865 870 875 880

Leu Leu Lys Glu Thr Leu Lys Arg Glu Lys Thr Thr Ser Leu Val Leu
885 890 895

Thr Gln Glu Arg Glu Met Leu Glu Lys Thr Tyr Lys Glu His Leu Asn
900 905 910

Ser Met Val Val Glu Arg Gln Gln Leu Leu Gln Asp Leu Glu Asp Leu
915 920 925

Arg Asn Val Ser Glu Thr Gln Gln Ser Leu Leu Ser Asp Gln Ile Leu
930 935 940

Glu Leu Lys Ser Ser His Lys Arg Glu Leu Arg Glu Arg Glu Glu Val
945 950 955 960

Leu Cys Gln Ala Gly Ala Ser Glu Gln Leu Ala Ser Gln Arg Leu Glu
965 970 975

Arg Leu Glu Met Glu His Asp Gln Glu Arg Gln Glu Met Met Ser Lys
980 985 990

Leu Leu Ala Met Glu Asn Ile His Lys Ala Thr Cys Glu Thr Ala Asp
995 1000 1005

Arg Glu Arg Ala Glu Met Ser Thr Glu Ile Ser Arg Leu Gln Ser
1010 1015 1020

Lys Ile Lys Glu Met Gln Gln Ala Thr Ser Pro Leu Ser Met Leu
1025 1030 1035

Gln Ser Gly Cys Gln Val Ile Gly Glu Glu Glu Val Glu Gly Asp
1040 1045 1050

Gly Ala Leu Ser Leu Leu Gln Gln Gly Glu Gln Leu Leu Glu Glu
1055 1060 1065

Asn Gly Asp Val Leu Leu Ser Leu Gln Arg Ala His Glu Gln Ala
1070 1075 1080

Val Lys Glu Asn Val Lys Met Ala Thr Glu Ile Ser Arg Leu Gln
1085 1090 1095

Gln Arg Leu Gln Lys Leu Glu Pro Gly Leu Val Met Ser Ser Cys
1100 1105 1110

Leu Asp Glu Pro Ala Thr Glu Phe Phe Gly Asn Thr Ala Glu Gln
1115 1120 1125

Thr Glu Gln Phe Leu Gln Gln Asn Arg Thr Lys Gln Val Glu Gly
1130 1135 1140

Val Thr Arg Arg His Val Leu Ser Asp Leu Glu Asp Asp Glu Val
1145 1150 1155

Arg Asp Leu Gly Ser Thr Gly Thr Ser Ser Val Gln Arg Gln Glu
1160 1165 1170

Val Lys Ile Glu Glu Ser Glu Ala Ser Val Glu Gly Phe Ser Glu
1175 1180 1185

Leu Glu Asn Ser Glu Glu Thr Arg Thr Glu Ser Trp Glu Leu Lys
1190 1195 1200

Asn Gln Ile Ser Gln Leu Gln Glu Gln Leu Met Met Leu Cys Ala
1205 1210 1215

Asp Cys Asp Arg Ala Ser Glu Lys Lys Gln Asp Leu Leu Phe Asp
1220 1225 1230

Val Ser Val Leu Lys Lys Lys Leu Lys Met Leu Glu Arg Ile Pro
1235 1240 1245

Glu Ala Ser Pro Lys Tyr Lys Leu Leu Tyr Glu Asp Val Ser Arg
1250 1255 1260

Glu Asn Asp Cys Leu Gln Glu Glu Leu Arg Met Met Glu Thr Arg
1265 1270 1275

Tyr Asp Glu Ala Leu Glu Asn Asn Lys Glu Leu Thr Ala Glu Val
1280 1285 1290

Phe Arg Leu Gln Asp Glu Leu Lys Lys Met Glu Glu Val Thr Glu
1295 1300 1305

Thr Phe Leu Ser Leu Glu Lys Ser Tyr Asp Glu Val Lys Ile Glu
1310 1315 1320

Asn Glu Gly Leu Asn Val Leu Val Leu Arg Leu Gln Gly Lys Ile
1325 1330 1335

Glu Lys Leu Gln Glu Ser Val Val Gln Arg Cys Asp Cys Cys Leu
1340 1345 1350

Trp Glu Ala Ser Leu Glu Asn Leu Glu Ile Glu Pro Asp Gly Asn
1355 1360 1365

Ile Leu Gln Leu Asn Gln Thr Leu Glu Glu Cys Val Pro Arg Val
1370 1375 1380

Arg Ser Val His His Val Ile Glu Glu Cys Lys Gln Glu Asn Gln
1385 1390 1395

Tyr Leu Glu Gly Asn Thr Gln Leu Leu Glu Lys Val Lys Ala His
1400 1405 1410

Glu Ile Ala Trp Leu His Gly Thr Ile Gln Thr His Gln Glu Arg
1415 1420 1425

Pro Arg Val Gln Asn Gln Val Ile Leu Glu Glu Asn Thr Thr Leu
1430 1435 1440

Leu Gly Phe Gln Asp Lys His Phe Gln His Gln Ala Thr Ile Ala
1445 1450 1455

Glu Leu Glu Leu Glu Lys Thr Lys Leu Gln Glu Leu Thr Arg Lys
1460 1465 1470

Leu Lys Glu Arg Val Thr Ile Leu Val Lys Gln Lys Asp Val Leu
1475 1480 1485

Ser His Gly Glu Lys Glu Glu Glu Leu Lys Ala Met Met His Asp
1490 1495 1500

Leu Gln Ile Thr Cys Ser Glu Met Gln Gln Lys Val Glu Leu Leu
1505 1510 1515

Arg Tyr Glu Ser Glu Lys Leu Gln Gln Glu Asn Ser Ile Leu Arg
1520 1525 1530

Asn Glu Ile Thr Thr Leu Asn Glu Glu Asp Ser Ile Ser Asn Leu
1535 1540 1545

Lys Leu Gly Thr Leu Asn Gly Ser Gln Glu Glu Met Trp Gln Lys
1550 1555 1560

Thr Glu Thr Val Lys Gln Glu Asn Ala Ala Val Gln Lys Met Val
1565 1570 1575

Glu Asn Leu Lys Lys Gln Ile Ser Glu Leu Lys Ile Lys Asn Gln
1580 1585 1590

Gln Leu Asp Leu Glu Asn Thr Glu Leu Ser Gln Lys Asn Ser Gln
1595 1600 1605

Asn Gln Glu Lys Leu Gln Glu Leu Asn Gln Arg Leu Thr Glu Met
1610 1615 1620

Leu Cys Gln Lys Glu Lys Glu Pro Gly Asn Ser Ala Leu Glu Glu
1625 1630 1635

Arg Glu Gln Glu Lys Phe Asn Leu Lys Glu Glu Leu Glu Arg Cys
1640 1645 1650

Lys Val Gln Ser Ser Thr Leu Val Ser Ser Leu Glu Ala Glu Leu
1655 1660 1665

Ser Glu Val Lys Ile Gln Thr His Ile Val Gln Gln Glu Asn His
1670 1675 1680

Leu Leu Lys Asp Glu Leu Glu Lys Met Lys Gln Leu His Arg Cys
1685 1690 1695

Pro Asp Leu Ser Asp Phe Gln Gln Lys Ile Ser Ser Val Leu Ser
1700 1705 1710

Tyr Asn Glu Lys Leu Leu Lys Glu Lys Glu Ala Leu Ser Glu Glu
1715 1720 1725

Leu Asn Ser Cys Val Asp Lys Leu Ala Lys Ser Ser Leu Leu Glu
1730 1735 1740

His Arg Ile Ala Thr Met Lys Gln Glu Gln Lys Ser Trp Glu His
1745 1750 1755

Gln Ser Ala Ser Leu Lys Ser Gln Leu Val Ala Ser Gln Glu Lys
1760 1765 1770

Val Gln Asn Leu Glu Asp Thr Val Gln Asn Val Asn Leu Gln Met
1775 1780 1785

Ser Arg Met Lys Ser Asp Leu Arg Val Thr Gln Gln Glu Lys Glu
1790 1795 1800

Ala Leu Lys Gln Glu Val Met Ser Leu His Lys Gln Leu Gln Asn
1805 1810 1815

Ala Gly Gly Lys Ser Trp Ala Pro Glu Ile Ala Thr His Pro Ser
1820 1825 1830

Gly Leu His Asn Gln Gln Lys Arg Leu Ser Trp Asp Lys Leu Asp
1835 1840 1845

His Leu Met Asn Glu Glu Gln Gln Leu Leu Trp Gln Glu Asn Glu
1850 1855 1860

Arg Leu Gln Thr Met Val Gln Asn Thr Lys Ala Glu Leu Thr His
1865 1870 1875

Ser Arg Glu Lys Val Arg Gln Leu Glu Ser Asn Leu Leu Pro Lys
1880 1885 1890

His Gln Lys His Leu Asn Pro Ser Gly Thr Met Asn Pro Thr Glu
1895 1900 1905

Gln Glu Lys Leu Ser Leu Lys Arg Glu Cys Asp Gln Phe Gln Lys
1910 1915 1920

Glu Gln Ser Pro Ala Asn Arg Lys Val Ser Gln Met Asn Ser Leu
1925 1930 1935

Glu Gln Glu Leu Glu Thr Ile His Leu Glu Asn Glu Gly Leu Lys
1940 1945 1950

Lys Lys Gln Val Lys Leu Asp Glu Gln Leu Met Glu Met Gln His
1955 1960 1965

Leu Arg Ser Thr Ala Thr Pro Ser Pro Ser Pro His Ala Trp Asp
1970 1975 1980

Leu Gln Leu Leu Gln Gln Gln Ala Cys Pro Met Val Pro Arg Glu
1985 1990 1995

Gln Phe Leu Gln Leu Gln Arg Gln Leu Leu Gln Ala Glu Arg Ile
2000 2005 2010

Asn Gln His Leu Gln Glu Glu Leu Glu Asn Arg Thr Ser Glu Thr
2015 2020 2025

Asn Thr Pro Gln Gly Asn Gln Glu Gln Leu Val Thr Val Met Glu
2030 2035 2040

Glu Arg Met Ile Glu Val Glu Gln Lys Leu Lys Leu Val Lys Arg
2045 2050 2055

Leu Leu Gln Glu Lys Val Asn Gln Leu Lys Glu Gln Leu Cys Lys
2060 2065 2070

Asn Thr Lys Ala Asp Ala Met Val Lys Asp Leu Tyr Val Glu Asn
2075 2080 2085

Ala Gln Leu Leu Lys Ala Leu Glu Val Thr Glu Gln Arg Gln Lys
2090 2095 2100

Thr Ala Glu Lys Lys Asn Tyr Leu Leu Glu Glu Lys Ile Ala Ser
2105 2110 2115

Leu Ser Asn Ile Val Arg Asn Leu Thr Pro Ala Pro Leu Thr Ser
2120 2125 2130

Thr Pro Pro Leu Arg Ser
2135



